Re: Segmentation fault with drgn + libkdumpfile

From: "Petr Tesařík" <petr@tesarici.cz>
To: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Omar Sandoval <osandov@osandov.com>, linux-debuggers@vger.kernel.org
Subject: Re: Segmentation fault with drgn + libkdumpfile
Date: Tue, 9 Jan 2024 10:06:09 +0100
Message-ID: <20240109100609.4e956beb@meshulam.tesarici.cz>
In-Reply-To: <20240108214008.32f807ee@meshulam.tesarici.cz>

On Mon, 8 Jan 2024 21:40:08 +0100
Petr Tesařík <petr@tesarici.cz> wrote:

> On Fri, 05 Jan 2024 13:53:15 -0800
> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> 
> > Petr Tesařík <petr@tesarici.cz> writes:  
> > > On Fri, 05 Jan 2024 10:38:16 -0800
> > > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> > >    
> > >> Hi Petr,
> > >> 
> > >> I recently encountered a segmentation fault with libkdumpfile & drgn
> > >> which appears to be related to the cache implementation. I've included
> > >> the stack trace at the end of this message, since it's a bit of a longer
> > >> one. The exact issue occurred with a test vmcore that I could probably
> > >> share with you privately if you'd like. In any case, the reproducer is
> > >> fairly straightforward in drgn code:
> > >> 
> > >> for t in for_each_task(prog):
> > >>     prog.stack_trace(t)
> > >> for t in for_each_task(prog):
> > >>     prog.stack_trace(t)
> > >> 
> > >> The repetition is required, the segfault only occurs on the second
> > >> iteration of the loop. Which, in hindsight, is a textbook sign that the
> > >> issue has to do with caching. I'd expect that the issue is specific to
> > >> this vmcore, it doesn't reproduce on others.
> > >> 
> > >> I stuck that into a git bisect script and bisected the libkdumpfile
> > >> commit that introduced it:
> > >> 
> > >> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
> > >> Author: Petr Tesarik <petr@tesarici.cz>
> > >> Date:   Wed Jan 11 22:53:01 2023 +0100
> > >> 
> > >>     Cache: Calculate eprobe in reinit_entry()
> > >> 
> > >>     If this function is called to reuse a ghost entry, the probe list
> > >>     has not been walked yet, so eprobe is left uninitialized.
> > >> 
> > >>     This passed the test case, because the correct old value was left
> > >>     on stack. Modify the test case to poison the stack.
> > >> 
> > >>     Signed-off-by: Petr Tesarik <petr@tesarici.cz>
> > >> 
> > >>  src/kdumpfile/cache.c      |  6 +++++-
> > >>  src/kdumpfile/test-cache.c | 13 +++++++++++++
> > >>  2 files changed, 18 insertions(+), 1 deletion(-)    
> > >
> > > This looks like a red herring to me. The cache most likely continues in
> > > a corrupted state without this commit, which may mask the issue (until
> > > it resurfaces later).    
> > 
> > I see, that makes a lot of sense.
> >   
> > >> I haven't yet tried to debug the logic of the cache implementation and
> > >> create a patch. I'm totally willing to try that, but I figured I would
> > >> send this report to you first, to see if there's something obvious that
> > >> sticks out to your eyes.    
> > >
> > > No, but I should be able to recreate the issue if I get a log of the
> > > cache API calls:
> > >
> > > - cache_alloc() - to know the number of elements
> > > - cache_get_entry()
> > > - cache_put_entry()
> > > - cache_insert()
> > > - cache_discard()
> > > - cache_flush() - not likely after initialization, but...    
> > 
> > I went ahead and logged each of these calls as you suggested, I tried to
> > log them at the beginning of the function call and always include the
> > cache pointer, cache_entry, and the key. I took the resulting log and
> > filtered it to just contain the most recently logged cache prior to the
> > crash, compressed it, and attached it. For completeness, the patch
> > I used is below (applies to tip branch 8254897 ("Merge pull request #78
> > from fweimer-rh/c99")).
> > 
> > I'll also see if I can reproduce it based on the log.  
> 
> Thank you for the log. I haven't had much time to look at it, but the
> first line is a good hint already:
> 
> 0x56098b68c4c0: cache_alloc(1024, 0)
> 
> Zero size means the data pointers are managed by the caller, so this
> must be the cache of mmap()'ed segments. That's the only cache which
> installs a cleanup callback with set_cache_entry_cleanup(). There is
> only one call to the cleanup callback for evicted entries in cache.c:
> 
> 		/* Get an unused cached entry. */
> 		if (cs->nuprobe != 0 &&
> 		    (cs->nuprec == 0 || cache->nprobe + bias > cache->dprobe))
> 			evict = evict_probe(cache, cs);
> 		else
> 			evict = evict_prec(cache, cs);
> 		if (cache->entry_cleanup)
> 			cache->entry_cleanup(cache->cleanup_data, evict);
> 
> The entries can be evicted from the probe partition or from the precious
> partition. This might be relevant. Please, can you re-run and log where
> the evict entry comes from?

I found some time this morning, and that extra logging wouldn't help.
Because of a bug in fcache_new(), the number of elements in the cache is
big enough that cache entries are never evicted in your case. It's quite
weird to hit a cache metadata bug after elements have been inserted. FWIW
I am not able to reproduce the bug by replaying the logged file read
pattern.
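
For reference, the choice between the two partitions in the snippet
quoted above can be modelled roughly as follows. This is only a toy
sketch in Python, not libkdumpfile code: the function name and the plain
integer arguments are made up for illustration, and "bias" stands for
whatever value the surrounding C code computes.

```python
def eviction_source(nuprobe, nuprec, nprobe, dprobe, bias=0):
    """Return the partition the quoted if/else would evict from.

    Argument names follow the C fields in the quoted snippet
    (cs->nuprobe, cs->nuprec, cache->nprobe, cache->dprobe).
    The function itself is hypothetical and models nothing else.
    """
    # Same condition as in the quoted cache.c fragment.
    if nuprobe != 0 and (nuprec == 0 or nprobe + bias > dprobe):
        return "probe"
    return "precious"


if __name__ == "__main__":
    # No unused precious entries left, so the probe partition is used.
    print(eviction_source(nuprobe=3, nuprec=0, nprobe=10, dprobe=20))
```

With no evictions at all, as in your log, neither branch is ever taken.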

Since you have a reliable reproducer, it cannot be a Heisenbug. But it
could be caused by the other cache - the cache of decompressed pages.
Do you know for sure that lzo1x_decompress_safe() crashes while trying
to _read_ from the input buffer, and not while trying to _write_ to the
output buffer?
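
One way to tell the two cases apart is to compare the faulting address
with the two buffer ranges. A minimal sketch, assuming you can get the
faulting address and the buffer pointers and lengths from the debugger;
the helper name, its parameters and the one-page slack are my
assumptions, not anything provided by libkdumpfile or drgn:

```python
def classify_fault(fault_addr, in_buf, in_len, out_buf, out_len, slack=4096):
    """Guess which buffer a faulting address belongs to.

    An overrun usually faults somewhere past the end of the buffer,
    so addresses up to `slack` bytes beyond the end still count as
    belonging to that buffer. All values are plain integers.
    """
    def near(base, length):
        return base <= fault_addr < base + length + slack

    hit_in = near(in_buf, in_len)
    hit_out = near(out_buf, out_len)
    if hit_in and not hit_out:
        return "read from input buffer"
    if hit_out and not hit_in:
        return "write to output buffer"
    if hit_in and hit_out:
        return "ambiguous"
    return "neither"
```

If the result is "neither", the pointers themselves are probably bogus,
which would point back at cache metadata rather than at the data.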

Petr T