From: James Harper <james.harper@bendigoit.com.au>
To: Sage Weil <sage@inktank.com>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
	"Sylvain Munaut (s.munaut@whatever-company.com)"
	<s.munaut@whatever-company.com>
Subject: RE: debugging librbd async - valgrind memtest hit
Date: Fri, 30 Aug 2013 23:39:00 +0000
Message-ID: <6035A0D088A63A46850C3988ED045A4B664B6EA2@BITCOM1.int.sbss.com.au>
In-Reply-To: <alpine.DEB.2.00.1308300820380.24783@cobra.newdream.net>

> 
> On Fri, 30 Aug 2013, James Harper wrote:
> > I finally got a valgrind memtest hit... output attached below email. I
> > recompiled all of tapdisk and ceph without any -O options (thought I had
> > already...) and it seems to have done the trick
> 
> What version is this?  The line numbers don't seem to match up with my
> source tree.

0.67.2, but I've peppered it with debug prints, so the line numbers will be off.

> > Basically it looks like an instance of AioRead is being accessed after
> > being freed. I need some hints on what API behaviour by the tapdisk
> > driver could be causing this to happen in librbd...
> 
> It looks like refcounting for the AioCompletion is off.  My first guess
> would be premature (or extra) calls to rados_aio_release or
> AioCompletion::release().
> 
> I took a quick look at the code and it looks like aio_read() is carrying a
> ref for the AioCompletion for the entire duration of the function, so it
> should not be disappearing (and taking the AioRead request struct with it)
> until well after where the invalid read is.  Maybe there is an error path
> somewhere that is dropping a ref it shouldn't?
> 

I'll see if I can find a way to track that. It's the c->get() and c->put() calls that track this, right?
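
One way to track it would be to log every reference change with the name of the caller, so an unmatched put() shows up in the trace. The following is only a minimal sketch under stated assumptions: TracedCompletion, its who argument, and bad_puts() are hypothetical names for illustration, not librbd's AioCompletion or its API, and the real object frees itself on the last put() where this one only reports it.

```cpp
#include <atomic>
#include <cassert>   // only needed by the usage example below
#include <cstdio>

// Hypothetical stand-in for a refcounted completion (NOT librbd's class).
// It starts with one reference, as if held by the creator.
class TracedCompletion {
public:
  // Take a reference and log who took it.
  void get(const char *who) {
    int now = ref_.fetch_add(1) + 1;
    std::fprintf(stderr, "%p get by %s, ref=%d\n",
                 static_cast<void *>(this), who, now);
  }

  // Drop a reference and log who dropped it.
  // Returns true only when this call released the last reference.
  // A count below zero means a put() without a matching get().
  bool put(const char *who) {
    int now = ref_.fetch_sub(1) - 1;
    std::fprintf(stderr, "%p put by %s, ref=%d%s\n",
                 static_cast<void *>(this), who, now,
                 now < 0 ? " (UNBALANCED)" : "");
    if (now < 0)
      ++bad_puts_;
    return now == 0;
  }

  int refs() const { return ref_.load(); }
  int bad_puts() const { return bad_puts_.load(); }

private:
  std::atomic<int> ref_{1};
  std::atomic<int> bad_puts_{0};
};
```

Usage would look something like this: take a reference for the duration of the request, drop it, let the caller release its own, and any further put() is flagged.

    TracedCompletion c;
    c.get("aio_read");          // ref=2
    c.put("aio_read");          // ref=1
    c.put("caller release");    // ref=0, last reference gone
    c.put("extra release");     // ref=-1, logged as UNBALANCED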
 
The crash seems a little bit different every time, so it could still be something stomping on memory, e.g. overwriting the ref count.
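
To tell those two cases apart, one option is to bracket the counter with known guard values and check them whenever the count looks wrong. This is a rough sketch only: GuardedRef, kCanary, and classify() are made-up names, nothing like this exists in librbd as far as I know, and an overwrite that hits only the counter and skips both guards would still be reported as an imbalance.

```cpp
#include <cassert>   // only needed by the usage example below
#include <cstdint>

// Arbitrary guard pattern; any value unlikely to appear by accident will do.
constexpr std::uint32_t kCanary = 0xC0FFEE11u;

// Hypothetical layout: the ref count sits between two guard words, so a
// linear overwrite running into the counter is likely to damage a guard too.
struct GuardedRef {
  std::uint32_t front = kCanary;
  int ref = 1;
  std::uint32_t back = kCanary;

  bool intact() const { return front == kCanary && back == kCanary; }
};

enum class RefState { Ok, Imbalance, Stomped };

// Damaged guards point at memory corruption; intact guards with a
// negative count point at an extra put()/release.
inline RefState classify(const GuardedRef &g) {
  if (!g.intact())
    return RefState::Stomped;
  if (g.ref < 0)
    return RefState::Imbalance;
  return RefState::Ok;
}
```

For example:

    GuardedRef g;
    g.ref = -1;                 // extra put, guards untouched
    assert(classify(g) == RefState::Imbalance);
    g.back = 0;                 // simulated overwrite past the counter
    assert(classify(g) == RefState::Stomped);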

Thanks

James


Thread overview:
2013-08-30 11:11 debugging librbd async - valgrind memtest hit James Harper
2013-08-30 15:23 ` Sage Weil
2013-08-30 23:39   ` James Harper [this message]
