From: Jens Axboe <axboe@kernel.dk>
To: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Cc: miaox@cn.fujitsu.com, Dave Chinner <david@fromorbit.com>,
    Chris Mason <chris.mason@oracle.com>,
    linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org,
    zheng.yan@oracle.com, linux-fsdevel@vger.kernel.org
Subject: Re: kernel BUG at fs/btrfs/extent-tree.c:1353
Date: Thu, 29 Jul 2010 20:54:58 +0200
Message-ID: <4C51CE82.7040901@kernel.dk>
In-Reply-To: <201007291909.51978.johannes.hirte@fem.tu-ilmenau.de>
On 07/29/2010 07:09 PM, Johannes Hirte wrote:
> On Thursday 22 July 2010, 20:07:23, Johannes Hirte wrote:
>> On Monday 19 July 2010, 10:01:46, Miao Xie wrote:
>>> On Thu, 15 Jul 2010 20:14:51 +0200, Johannes Hirte wrote:
>>>> On Thursday 15 July 2010, 02:11:04, Dave Chinner wrote:
>>>>> On Wed, Jul 14, 2010 at 05:25:23PM +0200, Johannes Hirte wrote:
>>>>>> On Thursday 08 July 2010, 16:31:09, Chris Mason wrote:
>>>>>> I'm not sure if btrfs is to blame for this error. After the errors I
>>>>>> switched to XFS on this system and now get this error:
>>>>>>
>>>>>> ls -l .kde4/share/apps/akregator/data/
>>>>>> ls: cannot access .kde4/share/apps/akregator/data/feeds.opml:
>>>>>> Structure needs cleaning
>>>>>> total 4
>>>>>> ?????????? ? ? ? ? ? feeds.opml
>>>>>
>>>>> What is the error reported in dmesg when the XFS filesystem shuts down?
>>>>
>>>> Nothing. I double checked the logs. There are only the messages when
>>>> mounting the filesystem. No other errors are reported than the
>>>> inaccessible file and the output from xfs_check.
>>>
>>> Is there anything wrong with your disks or memory?
>>> Sometimes bad memory can corrupt the filesystem. I ran into this kind
>>> of problem some time ago.
>>
>> I don't think that's the case. I've checked the RAM with memtest86+ and got
>> no errors. I got the errors with two different disks, the first one with
>> btrfs, the second one now with XFS. Before changing to the second disk,
>> I ran badblocks on it to be sure it has no errors.
>
> I think I've found it. The bug was introduced by
>
> commit 7f0e7bed936a0c422641a046551829a01341dd80
> Author: Christoph Hellwig <hch@lst.de>
> Date: Tue Jun 8 18:14:34 2010 +0200
>
> writeback: fix writeback completion notifications
>
> The code dealing with bdi_work->state and completion of a bdi_work is a
> major mess currently. This patch makes sure we directly use one set of
> flags to deal with it, and use it consistently, which means:
>
> - always notify about completion from the rcu callback. We only ever
> wait for it from on-stack callers, so this simplification does not
> even cause a theoretical slowdown currently. It also makes sure we
> don't miss out on the notification if we ever add other callers to
> wait for it.
> - make earlier completion notification depending on the on-stack
> allocation, not the sync mode. If we introduce new callers that
> want to do WB_SYNC_NONE writeback from on-stack callers this will
> be nessecary.
>
> Also rename bdi_wait_on_work_clear to bdi_wait_on_work_done and inline
> a few small functions into their only caller to make the code
> understandable.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
>
> and seems to be fixed by
>
> commit 83ba7b071f30f7c01f72518ad72d5cd203c27502
> Author: Christoph Hellwig <hch@lst.de>
> Date: Tue Jul 6 08:59:53 2010 +0200
>
> writeback: simplify the write back thread queue
>
> First remove items from work_list as soon as we start working on them. This
> means we don't have to track any pending or visited state and can get
> rid of all the RCU magic freeing the work items - we can simply free
> them once the operation has finished. Second use a real completion for
> tracking synchronous requests - if the caller sets the completion pointer
> we complete it, otherwise use it as a boolean indicator that we can free
> the work item directly. Third unify struct wb_writeback_args and struct
> bdi_work into a single data structure, wb_writeback_work. Previous we
> set all parameters into a struct wb_writeback_args, copied it into
> struct bdi_work, copied it again on the stack to use it there. Instead
> of just allocate one structure dynamically or on the stack and use it
> all the way through the stack.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
>
> I was able to reproduce the bug by repeatedly unpacking a big tar file
> and deleting the extracted files. With btrfs the kernel normally
> crashed within 20 runs. After commit
> 83ba7b071f30f7c01f72518ad72d5cd203c27502 it survived more than 500
> runs.
Makes sense. That first commit could potentially pass in stack cruft as
the wbc arg, so I think we can safely consider it fixed now.
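
For anyone who wants to re-run the reproduction described above, here is
a minimal sketch of the unpack/delete loop. It is only an illustration,
not the script Johannes used: the choice of Python, the function name
unpack_delete_loop and the example paths are all assumptions, and the
scratch directory is assumed to live on the filesystem under test.

```python
import shutil
import tarfile
from pathlib import Path


def unpack_delete_loop(archive: Path, scratch: Path, runs: int) -> int:
    """Unpack `archive` into `scratch`, then delete `scratch`, `runs` times.

    Hypothetical helper; returns the number of completed iterations so a
    caller can tell how far the loop got before a failure.
    """
    completed = 0
    for _ in range(runs):
        # Start each run from an empty directory on the filesystem under test.
        scratch.mkdir(parents=True, exist_ok=True)
        with tarfile.open(archive) as tar:
            tar.extractall(scratch)
        # Remove everything that was just unpacked.
        shutil.rmtree(scratch)
        completed += 1
    return completed


# Example call (both paths are placeholders, not taken from the report):
# unpack_delete_loop(Path("/tmp/big.tar"), Path("/mnt/test/scratch"), runs=500)
```

The report mentions crashes within about 20 runs before the fix and more
than 500 clean runs after it, so a run count in the hundreds is a
reasonable starting point. Only point this at a trusted archive and a
scratch directory you can afford to lose.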
--
Jens Axboe