From: Linus Torvalds <torvalds@osdl.org>
To: Nathan Scott <nathans@sgi.com>
Cc: Jens Axboe <axboe@suse.de>,
"Kevin P. Fleming" <kpfleming@backtobasicsmgmt.com>,
LKML <linux-kernel@vger.kernel.org>,
Linux-raid maillist <linux-raid@vger.kernel.org>,
linux-lvm@sistina.com
Subject: Re: Reproducable OOPS with MD RAID-5 on 2.6.0-test11
Date: Wed, 3 Dec 2003 09:13:22 -0800 (PST)
Message-ID: <Pine.LNX.4.58.0312030851180.5258@home.osdl.org>
In-Reply-To: <20031203143229.A1918624@wobbly.melbourne.sgi.com>
On Wed, 3 Dec 2003, Nathan Scott wrote:
>
> The XFS tests just tripped up a panic in raid5 in -test11 -- a kdb
> stacktrace follows. Seems to be reproducible, but not always the
> same test that causes it. And I haven't seen a double bio_put yet,
> this first problem keeps getting in the way I guess.
Ok, debugging this oops makes me _think_ that the problem comes from here:
raid5.c: around line 1000:
	....
	wbi = dev->written;
	dev->written = NULL;
	while (wbi && wbi->bi_sector < dev->sector + STRIPE_SECTORS) {
		wbi2 = wbi->bi_next;
		if (--wbi->bi_phys_segments == 0) {
			md_write_end(conf->mddev);
			wbi->bi_next = return_bi;
			return_bi = wbi;
		}
		wbi = wbi2;
	}
	....
where it appears that the "wbi->bi_sector" access takes a page fault,
probably due to PAGE_ALLOC debugging. It appears that somebody has already
finished (and thus free'd) that bio.
I dunno - I can't follow what that code does at all.
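[Editorial sketch, not part of the original mail.] For readers following along, here is a user-space model of what the quoted loop appears to do. It assumes, as raid5.c of that era seems to, that bi_phys_segments is overloaded as a count of stripes still pending for the bio. The names model_bio, complete_written and list_len, and the STRIPE_SECTORS value, are hypothetical and exist only for this illustration; md_write_end() is left out.

```c
#include <assert.h>
#include <stddef.h>

#define STRIPE_SECTORS 8	/* assumed value, for illustration only */

/* Hypothetical stand-in for struct bio: only the fields the loop touches. */
struct model_bio {
	unsigned long long bi_sector;	/* first sector of the bio */
	int bi_phys_segments;		/* assumed: count of stripes still pending */
	struct model_bio *bi_next;
};

/*
 * Walk the bios chained on one device of a stripe, drop one reference
 * from each bio that starts inside this stripe, and push the ones that
 * reach zero onto the return list. Returns the new head of that list.
 */
static struct model_bio *complete_written(struct model_bio *written,
					  unsigned long long dev_sector,
					  struct model_bio *return_bi)
{
	struct model_bio *wbi = written;

	/* The wbi->bi_sector read here is the access that faulted in the oops. */
	while (wbi && wbi->bi_sector < dev_sector + STRIPE_SECTORS) {
		/* Save the link first: bi_next is reused for the return list. */
		struct model_bio *wbi2 = wbi->bi_next;

		assert(wbi->bi_phys_segments > 0);
		if (--wbi->bi_phys_segments == 0) {
			wbi->bi_next = return_bi;
			return_bi = wbi;
		}
		wbi = wbi2;
	}
	return return_bi;
}

/* Count the entries on a list; helper for trying the model out. */
static int list_len(const struct model_bio *b)
{
	int n = 0;

	for (; b; b = b->bi_next)
		n++;
	return n;
}
```

In this model, if some other path has already dropped the last reference and freed a bio that is still chained on dev->written, the very first wbi->bi_sector read touches freed memory, which would match the fault described above.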
One problem is that the slab code - because it caches the slabs and shares
pages between different slab entries - will not trigger the bugs that
DEBUG_PAGEALLOC would show very easily. So here's my ugly hack once more,
to see if that makes the bug show up more repeatably and quicker. Nathan?
Linus
-+- slab-debug-on-steroids -+-
NOTE! For this patch to make sense, you have to enable the page allocator
debugging thing (CONFIG_DEBUG_PAGEALLOC), and you have to live with the
fact that it wastes a _lot_ of memory.
There's another problem with this patch: if the bug is actually in the
slab code itself, this will obviously not find it, since it disables that
code entirely.
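[Editorial sketch, not part of the original mail.] A rough user-space analogy of why one-object-per-page allocation makes use-after-free bugs show up immediately. Here mmap()/munmap() stand in for __get_free_pages()/__free_pages() with CONFIG_DEBUG_PAGEALLOC unmapping freed pages; this is an assumption-laden illustration, not kernel code. The helpers page_alloc_obj, page_free_obj and read_faults are hypothetical names, and the fault probe deliberately reads freed memory, so it only makes sense as a demonstration on a POSIX system.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf probe_env;

static void on_fault(int sig)
{
	(void)sig;
	siglongjmp(probe_env, 1);
}

/* Round a size up to a whole number of pages. */
static size_t page_round(size_t size)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	return (size + page - 1) / page * page;
}

/* Give every object its own page(s), wasting memory on purpose. */
static void *page_alloc_obj(size_t size)
{
	void *p = mmap(NULL, page_round(size), PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Freeing unmaps the page, so any later access faults right away. */
static void page_free_obj(void *p, size_t size)
{
	munmap(p, page_round(size));
}

/* Returns 1 if reading *addr faults, 0 if the read succeeds. */
static int read_faults(const volatile int *addr)
{
	struct sigaction sa, old_segv, old_bus;
	volatile int faulted = 0;

	sa.sa_handler = on_fault;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	sigaction(SIGSEGV, &sa, &old_segv);
	sigaction(SIGBUS, &sa, &old_bus);

	if (sigsetjmp(probe_env, 1) == 0)
		(void)*addr;		/* the probing read */
	else
		faulted = 1;		/* we got here via the signal handler */

	sigaction(SIGSEGV, &old_segv, NULL);
	sigaction(SIGBUS, &old_bus, NULL);
	return faulted;
}
```

With a normal malloc()-style allocator the stale read would usually succeed silently, because the freed object still sits on a mapped page shared with live objects. That is the same reason the regular slab path hides this class of bug.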
===== mm/slab.c 1.110 vs edited =====
--- 1.110/mm/slab.c Tue Oct 21 22:10:10 2003
+++ edited/mm/slab.c Mon Dec 1 15:29:06 2003
@@ -1906,6 +1906,21 @@
 static inline void * __cache_alloc (kmem_cache_t *cachep, int flags)
 {
+#if 1
+	void *ptr = (void*)__get_free_pages(flags, cachep->gfporder);
+	if (ptr) {
+		struct page *page = virt_to_page(ptr);
+		SET_PAGE_CACHE(page, cachep);
+		SET_PAGE_SLAB(page, 0x01020304);
+		if (cachep->ctor) {
+			unsigned long ctor_flags = SLAB_CTOR_CONSTRUCTOR;
+			if (!(flags & __GFP_WAIT))
+				ctor_flags |= SLAB_CTOR_ATOMIC;
+			cachep->ctor(ptr, cachep, ctor_flags);
+		}
+	}
+	return ptr;
+#else
 	unsigned long save_flags;
 	void* objp;
 	struct array_cache *ac;
@@ -1925,6 +1940,7 @@
 	local_irq_restore(save_flags);
 	objp = cache_alloc_debugcheck_after(cachep, flags, objp, __builtin_return_address(0));
 	return objp;
+#endif
 }
/*
@@ -2042,6 +2058,15 @@
*/
 static inline void __cache_free (kmem_cache_t *cachep, void* objp)
 {
+#if 1
+	{
+		struct page *page = virt_to_page(objp);
+		int order = cachep->gfporder;
+		if (cachep->dtor)
+			cachep->dtor(objp, cachep, 0);
+		__free_pages(page, order);
+	}
+#else
 	struct array_cache *ac = ac_data(cachep);
 	check_irq_off();
@@ -2056,6 +2081,7 @@
 		cache_flusharray(cachep, ac);
 		ac_entry(ac)[ac->avail++] = objp;
 	}
+#endif
 }
/**