On Thu 23-06-11 16:19:08, Moffett, Kyle D wrote:
> On Jun 23, 2011, at 16:55, Sean Ryle wrote:
> > Maybe I am wrong here, but shouldn't the cast be to (unsigned long) or to (sector_t)?
> > 
> > Line 534 of commit.c:
> >                         jbd_debug(4, "JBD: got buffer %llu (%p)\n",
> >                                 (unsigned long long)bh->b_blocknr, bh->b_data);
> 
> No, that printk() is fine, the format string says "%llu" so the cast is
> unsigned long long.
> 
> Besides which, line 534 in the Debian 2.6.32 kernel I am using is this
> one:
> 
>   J_ASSERT(commit_transaction->t_nr_buffers <=
>            commit_transaction->t_outstanding_credits);
  Hmm, OK, so we've used more metadata buffers than we told JBD2 to
reserve. I suppose you are not using data=journal mode and the filesystem
was created as ext4 (i.e. not converted from ext3), right? Are you using
quotas?

> If somebody can tell me what information would help to debug this I'd be
> more than happy to throw a whole bunch of debug printks under that error
> condition and try to trigger the crash with that.
> 
> Alternatively I could remove that J_ASSERT() and instead add some debug
> further down around the "commit_transaction->t_outstanding_credits--;"
> to try to see exactly what IO it's handling when it runs out of credits.
  The trouble is that the problem is most likely in some journal list
shuffling code: if some operation had merely underestimated the number of
buffers it needs, we would already fail this assertion in
jbd2_journal_dirty_metadata():
J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
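
To illustrate the reasoning, here is a minimal user-space sketch of the two
checks. It is not kernel code: the toy_* names and structures are made up for
this mail, and only the field names are borrowed from JBD2. It assumes each
handle reserves its credits up front, spends one per newly dirtied buffer, and
returns the unused ones when it stops.

```c
#include <assert.h>

/* Hypothetical, heavily simplified stand-ins for the JBD2 structures. */
struct toy_transaction {
	int t_outstanding_credits;	/* credits reserved by handles */
	int t_nr_buffers;		/* metadata buffers attached */
};

struct toy_handle {
	struct toy_transaction *h_transaction;
	int h_buffer_credits;		/* credits this handle may still spend */
};

/* Reserve nblocks credits on the transaction for one operation. */
static struct toy_handle toy_start(struct toy_transaction *t, int nblocks)
{
	struct toy_handle h = { t, nblocks };

	assert(nblocks >= 0);
	t->t_outstanding_credits += nblocks;
	return h;
}

/*
 * Per-handle check, modelled on the J_ASSERT_JH() above. Returns -1
 * where the real code would assert, 0 otherwise.
 */
static int toy_dirty_metadata(struct toy_handle *h)
{
	if (h->h_buffer_credits <= 0)
		return -1;
	h->h_buffer_credits--;
	h->h_transaction->t_nr_buffers++;
	return 0;
}

/* Give the unused credits back to the transaction. */
static void toy_stop(struct toy_handle *h)
{
	h->h_transaction->t_outstanding_credits -= h->h_buffer_credits;
	h->h_buffer_credits = 0;
}

/* Commit-time check, modelled on the J_ASSERT() quoted from commit.c. */
static int toy_commit_ok(const struct toy_transaction *t)
{
	return t->t_nr_buffers <= t->t_outstanding_credits;
}
```

In this model, a handle started with one credit that dirties two buffers is
caught by toy_dirty_metadata() on the second call, and toy_commit_ok() still
holds afterwards. The commit-time check can only fail if a buffer ends up
counted in t_nr_buffers without going through a handle, which is the kind of
thing that list shuffling could do.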

The patch below might catch the problem closer to the place where it
happens...

Also, could you try a current kernel to see whether the bug still happens
there?
								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR