All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Remove softlockup from invalidate_mapping_pages.
       [not found] <20060420160549.7637.patches@notabene>
@ 2006-04-20  6:29 ` NeilBrown
  2006-04-20  7:38   ` Andrew Morton
  0 siblings, 1 reply; 27+ messages in thread
From: NeilBrown @ 2006-04-20  6:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Steinar H. Gunderson

The following patch fixes a problem with invalidate_mapping_pages.
Please look at the patch description and then come back here, because
there are some things I don't understand which you might be able to
help me with.
...

Thanks. 
I have had 2 reports of softlockups in this code apparently related to md. 
md calls invalidate_bdev on all the component block devices that it is 
building into an array.  These block devices are very likely to have just one
page in their mapping, right near the end as mdadm will have read the 
superblock which lives near the end.

However, I cannot see why the page would be locked.
Being locked for read is very unlikely because mdadm would have already
read the superblock.  I guess locked for read-ahead might be possible
(I assume readahead does lock the pages) but as only one or maybe two reads 
are a performed by mdadm, not much readahead should be generated.

Being locked for write also seems unlikely as if mdadm were to write
(which is fairly unlikely but not impossible) it would fsync() straight
away so by the time that it comes to assemble the array, all io
should have finished.

So that is (1) - I don't see why the page would be locked.

And (2) - I have a report (on linux-raid) of a soft-lockup which
lasted 76 seconds! 
Now if the device was 100Gig, that is 25million page addresses or 
3microseconds per loop. Is that at all likely for this loop - it 
does take and drop a spinlock but could that come close to a
few thousand cycles?

And the processor in this case was a dual-core amd64 - with SMP enabled.
I can imaging a long lockup on a uniprocessor, but if a second processor
core is free to unlock the page when the IO (Whatever it is) completes,
a 76 second delay would be unexpected.

The original bug report can be found at
  http://marc.theaimsgroup.com/?l=linux-raid&m=114550096908177&w=2

Finally (3) - I think that invalidate_mapping_pages should probably
have a cond_resched() call in it, except that drop_pagecache_sb in
fs/drop_caches.c calls it with the "inode_lock" spinlock held, which
would be bad.  Would it be safe (or could it be made safe) to drop and
regain the lock around that call?

Comments welcome, but in any case I think the patch is needed.

Thanks for your time,
NeilBrown




### Comments for Changeset

If invalidate_mapping_pages is called to invalidate a very large
mapping (e.g. a very large block device) and if the only active page
in that device is near the end  (or at least, at a very large  index),
such as, say, the superblock of an md array, and if that page
happens to be locked when invalidate_mapping_pages is called,
then
  pagevec_lookup will return this page and
  as it is locked, 'next' will be incremented and pagevec_lookup
  will be called again. and again. and again.
  while we count from 0 upto a very large number.

We should really always set 'next' to 'page->index+1' before going
around the loop again, not just if the page isn't locked.


Cc: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./mm/truncate.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff ./mm/truncate.c~current~ ./mm/truncate.c
--- ./mm/truncate.c~current~	2006-04-20 15:27:22.000000000 +1000
+++ ./mm/truncate.c	2006-04-20 15:38:20.000000000 +1000
@@ -238,13 +238,11 @@ unsigned long invalidate_mapping_pages(s
 		for (i = 0; i < pagevec_count(&pvec); i++) {
 			struct page *page = pvec.pages[i];
 
-			if (TestSetPageLocked(page)) {
-				next++;
+			next = page->index+1;
+
+			if (TestSetPageLocked(page))
 				continue;
-			}
-			if (page->index > next)
-				next = page->index;
-			next++;
+
 			if (PageDirty(page) || PageWriteback(page))
 				goto unlock;
 			if (page_mapped(page))

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-20  6:29 ` [PATCH] Remove softlockup from invalidate_mapping_pages NeilBrown
@ 2006-04-20  7:38   ` Andrew Morton
  2006-04-20  9:24     ` Neil Brown
  2006-04-26 20:48     ` Steinar H. Gunderson
  0 siblings, 2 replies; 27+ messages in thread
From: Andrew Morton @ 2006-04-20  7:38 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-kernel, sgunderson

NeilBrown <neilb@suse.de> wrote:
>
> The following patch fixes a problem with invalidate_mapping_pages.
> Please look at the patch description and then come back here, because
> there are some things I don't understand which you might be able to
> help me with.
> ...
> 
> Thanks. 
> I have had 2 reports of softlockups in this code apparently related to md. 
> md calls invalidate_bdev on all the component block devices that it is 
> building into an array.  These block devices are very likely to have just one
> page in their mapping, right near the end as mdadm will have read the 
> superblock which lives near the end.
> 
> However, I cannot see why the page would be locked.
> Being locked for read is very unlikely because mdadm would have already
> read the superblock.  I guess locked for read-ahead might be possible
> (I assume readahead does lock the pages) but as only one or maybe two reads 
> are a performed by mdadm, not much readahead should be generated.
> 
> Being locked for write also seems unlikely as if mdadm were to write
> (which is fairly unlikely but not impossible) it would fsync() straight
> away so by the time that it comes to assemble the array, all io
> should have finished.
> 
> So that is (1) - I don't see why the page would be locked.

Dunno.   memory reclaim could lock the page, but that'd be pretty rare.

> And (2) - I have a report (on linux-raid) of a soft-lockup which
> lasted 76 seconds! 
> Now if the device was 100Gig, that is 25million page addresses or 
> 3microseconds per loop. Is that at all likely for this loop - it 
> does take and drop a spinlock but could that come close to a
> few thousand cycles?

Maybe this CPU is holding some lock which pervents the page-unlocking thread
from unlocking the page?

You could perhaps do

	page->who_locked_me = __builtin_return_address(<something>);

in lock_page().  And in TestSetPageLocked().  It'd take some poking. 
Alternatively a well-timed sysrq-T might tell you.


As you point out, the loop will count up to page->index one-at-a-time.  3us
sounds rather heavy, but the radix-tree gang-lookup code isn't terribly
efficient, and the tree will be deep.

> And the processor in this case was a dual-core amd64 - with SMP enabled.
> I can imaging a long lockup on a uniprocessor, but if a second processor
> core is free to unlock the page when the IO (Whatever it is) completes,
> a 76 second delay would be unexpected.

yup.

> The original bug report can be found at
>   http://marc.theaimsgroup.com/?l=linux-raid&m=114550096908177&w=2
> 
> Finally (3) - I think that invalidate_mapping_pages should probably
> have a cond_resched() call in it, except that drop_pagecache_sb in
> fs/drop_caches.c calls it with the "inode_lock" spinlock held, which
> would be bad.

Yes, we used to have a cond_resched() in there, but I took it out, for that
reason.  I'm bad.

  Would it be safe (or could it be made safe) to drop and
> regain the lock around that call?

Hard.  We're walking a long list-head chain which is pinned by that lock.

> If invalidate_mapping_pages is called to invalidate a very large
> mapping (e.g. a very large block device) and if the only active page
> in that device is near the end  (or at least, at a very large  index),
> such as, say, the superblock of an md array, and if that page
> happens to be locked when invalidate_mapping_pages is called,
> then
>   pagevec_lookup will return this page and
>   as it is locked, 'next' will be incremented and pagevec_lookup
>   will be called again. and again. and again.
>   while we count from 0 upto a very large number.
> 
> We should really always set 'next' to 'page->index+1' before going
> around the loop again, not just if the page isn't locked.
> 

OK.

> Cc: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
> Signed-off-by: Neil Brown <neilb@suse.de>
> 
> ### Diffstat output
>  ./mm/truncate.c |   10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff ./mm/truncate.c~current~ ./mm/truncate.c
> --- ./mm/truncate.c~current~	2006-04-20 15:27:22.000000000 +1000
> +++ ./mm/truncate.c	2006-04-20 15:38:20.000000000 +1000
> @@ -238,13 +238,11 @@ unsigned long invalidate_mapping_pages(s
>  		for (i = 0; i < pagevec_count(&pvec); i++) {
>  			struct page *page = pvec.pages[i];
>  
> -			if (TestSetPageLocked(page)) {
> -				next++;
> +			next = page->index+1;
> +
> +			if (TestSetPageLocked(page))
>  				continue;
> -			}
> -			if (page->index > next)
> -				next = page->index;
> -			next++;
> +
>  			if (PageDirty(page) || PageWriteback(page))
>  				goto unlock;
>  			if (page_mapped(page))

We're not supposed to look at page->index of an unlocked page.

In practice, I think it's OK - there's no _reason_ why anyone would want to
trash the ->index of a just-truncated page.  However I think it'd be saner
to a) only look at ->index after we've tried to lock the page and b) make
sure that ->index is really "to the right" of where we're currently at.

How's this look?

--- devel/mm/truncate.c~remove-softlockup-from-invalidate_mapping_pages	2006-04-20 00:20:49.000000000 -0700
+++ devel-akpm/mm/truncate.c	2006-04-20 00:28:18.000000000 -0700
@@ -230,14 +230,24 @@ unsigned long invalidate_mapping_pages(s
 			pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
 		for (i = 0; i < pagevec_count(&pvec); i++) {
 			struct page *page = pvec.pages[i];
+			pgoff_t index;
+			int locked;
 
-			if (TestSetPageLocked(page)) {
-				next++;
-				continue;
-			}
-			if (page->index > next)
-				next = page->index;
+			locked = TestSetPageLocked(page);
+
+			/*
+			 * We really shouldn't be looking at the ->index of an
+			 * unlocked page.  But we're not allowed to lock these
+			 * pages.  So we rely upon nobody altering the ->index
+			 * of this (pinned-by-us) page.
+			 */
+			index = page->index;
+			if (index > next)
+				next = index;
 			next++;
+			if (!locked)
+				continue;
+
 			if (PageDirty(page) || PageWriteback(page))
 				goto unlock;
 			if (page_mapped(page))
_



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-20  7:38   ` Andrew Morton
@ 2006-04-20  9:24     ` Neil Brown
  2006-04-20  9:30       ` Jens Axboe
  2006-04-20  9:52       ` Andrew Morton
  2006-04-26 20:48     ` Steinar H. Gunderson
  1 sibling, 2 replies; 27+ messages in thread
From: Neil Brown @ 2006-04-20  9:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jens Axboe, linux-kernel, sgunderson

On Thursday April 20, akpm@osdl.org wrote:
> > Cc: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
> > Signed-off-by: Neil Brown <neilb@suse.de>
> > 
> > ### Diffstat output
> >  ./mm/truncate.c |   10 ++++------
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> > 
> > diff ./mm/truncate.c~current~ ./mm/truncate.c
> > --- ./mm/truncate.c~current~	2006-04-20 15:27:22.000000000 +1000
> > +++ ./mm/truncate.c	2006-04-20 15:38:20.000000000 +1000
> > @@ -238,13 +238,11 @@ unsigned long invalidate_mapping_pages(s
> >  		for (i = 0; i < pagevec_count(&pvec); i++) {
> >  			struct page *page = pvec.pages[i];
> >  
> > -			if (TestSetPageLocked(page)) {
> > -				next++;
> > +			next = page->index+1;
> > +
> > +			if (TestSetPageLocked(page))
> >  				continue;
> > -			}
> > -			if (page->index > next)
> > -				next = page->index;
> > -			next++;
> > +
> >  			if (PageDirty(page) || PageWriteback(page))
> >  				goto unlock;
> >  			if (page_mapped(page))
> 
> We're not supposed to look at page->index of an unlocked page.

We're not?  Does Jens know that?
__generic_file_splice_read seems to violate this principle!

Are you allowed to look at ->mapping? Or can that change magically
too?  What's the threat-model?  Is it splice(), or something more wicked?

> 
> In practice, I think it's OK - there's no _reason_ why anyone would want to
> trash the ->index of a just-truncated page.  However I think it'd be saner
> to a) only look at ->index after we've tried to lock the page and b) make
> sure that ->index is really "to the right" of where we're currently at.
> 
> How's this look?
> 

Uhmm... possibly OK, but I think I'd rather change find_get_pages to
take an index pointer like find_get_pages_tag does, and do the thing
safely.  However that started turning into a big patch (reiserfs calls
find_get_pages directly a few times, and I hadn't even got up to
callers of pagevec_lookup....

So when I have a cleared head and a bit more time I'll see if I can
come up with a better patch which only looks at ->index under
->tree_lock.

Thanks for the feedback,

NeilBrown


> --- devel/mm/truncate.c~remove-softlockup-from-invalidate_mapping_pages	2006-04-20 00:20:49.000000000 -0700
> +++ devel-akpm/mm/truncate.c	2006-04-20 00:28:18.000000000 -0700
> @@ -230,14 +230,24 @@ unsigned long invalidate_mapping_pages(s
>  			pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
>  		for (i = 0; i < pagevec_count(&pvec); i++) {
>  			struct page *page = pvec.pages[i];
> +			pgoff_t index;
> +			int locked;
>  
> -			if (TestSetPageLocked(page)) {
> -				next++;
> -				continue;
> -			}
> -			if (page->index > next)
> -				next = page->index;
> +			locked = TestSetPageLocked(page);
> +
> +			/*
> +			 * We really shouldn't be looking at the ->index of an
> +			 * unlocked page.  But we're not allowed to lock these
> +			 * pages.  So we rely upon nobody altering the ->index
> +			 * of this (pinned-by-us) page.
> +			 */
> +			index = page->index;
> +			if (index > next)
> +				next = index;
>  			next++;
> +			if (!locked)
> +				continue;
> +
>  			if (PageDirty(page) || PageWriteback(page))
>  				goto unlock;
>  			if (page_mapped(page))
> _
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-20  9:24     ` Neil Brown
@ 2006-04-20  9:30       ` Jens Axboe
  2006-04-20  9:50         ` Neil Brown
  2006-04-20  9:52       ` Andrew Morton
  1 sibling, 1 reply; 27+ messages in thread
From: Jens Axboe @ 2006-04-20  9:30 UTC (permalink / raw)
  To: Neil Brown; +Cc: Andrew Morton, linux-kernel, sgunderson

On Thu, Apr 20 2006, Neil Brown wrote:
> On Thursday April 20, akpm@osdl.org wrote:
> > > Cc: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
> > > Signed-off-by: Neil Brown <neilb@suse.de>
> > > 
> > > ### Diffstat output
> > >  ./mm/truncate.c |   10 ++++------
> > >  1 file changed, 4 insertions(+), 6 deletions(-)
> > > 
> > > diff ./mm/truncate.c~current~ ./mm/truncate.c
> > > --- ./mm/truncate.c~current~	2006-04-20 15:27:22.000000000 +1000
> > > +++ ./mm/truncate.c	2006-04-20 15:38:20.000000000 +1000
> > > @@ -238,13 +238,11 @@ unsigned long invalidate_mapping_pages(s
> > >  		for (i = 0; i < pagevec_count(&pvec); i++) {
> > >  			struct page *page = pvec.pages[i];
> > >  
> > > -			if (TestSetPageLocked(page)) {
> > > -				next++;
> > > +			next = page->index+1;
> > > +
> > > +			if (TestSetPageLocked(page))
> > >  				continue;
> > > -			}
> > > -			if (page->index > next)
> > > -				next = page->index;
> > > -			next++;
> > > +
> > >  			if (PageDirty(page) || PageWriteback(page))
> > >  				goto unlock;
> > >  			if (page_mapped(page))
> > 
> > We're not supposed to look at page->index of an unlocked page.
> 
> We're not?  Does Jens know that?
> __generic_file_splice_read seems to violate this principle!

Hmmm, where?


-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-20  9:30       ` Jens Axboe
@ 2006-04-20  9:50         ` Neil Brown
  2006-04-20  9:59           ` Jens Axboe
  0 siblings, 1 reply; 27+ messages in thread
From: Neil Brown @ 2006-04-20  9:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andrew Morton, linux-kernel, sgunderson

On Thursday April 20, axboe@suse.de wrote:
> > > 
> > > We're not supposed to look at page->index of an unlocked page.
> > 
> > We're not?  Does Jens know that?
> > __generic_file_splice_read seems to violate this principle!
> 
> Hmmm, where?
> 
2.6.17-rc1-mm1
fs/splice.c

static int __generic_file_splice_read(struct file *in, struct inode *pipe,
				      size_t len, unsigned int flags)
{
...snip....
	do_page_cache_readahead(mapping, in, index, nr_pages);

	/*
	 * Get as many pages from the page cache as possible..
	 * Start IO on the page cache entries we create (we
	 * can assume that any pre-existing ones we find have
	 * already had IO started on them).
	 */
	i = find_get_pages(mapping, index, nr_pages, pages);

	/*
	 * common case - we found all pages and they are contiguous,
	 * kick them off
	 */
	if (i && (pages[i - 1]->index == index + i - 1))
                  ^^^^^^^^^^^^^^^^^^^^
***There and
		goto splice_them;

	/*
	 * fill shadow[] with pages at the right locations, so we only
	 * have to fill holes
	 */
	memset(shadow, 0, nr_pages * sizeof(struct page *));
	for (j = 0; j < i; j++)
		shadow[pages[j]->index - index] = pages[j];
                      ^^^^^^^^^^^^^^^^

***There

NeilBrown

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-20  9:24     ` Neil Brown
  2006-04-20  9:30       ` Jens Axboe
@ 2006-04-20  9:52       ` Andrew Morton
  2006-04-20 10:52         ` Jens Axboe
  1 sibling, 1 reply; 27+ messages in thread
From: Andrew Morton @ 2006-04-20  9:52 UTC (permalink / raw)
  To: Neil Brown; +Cc: axboe, linux-kernel, sgunderson

Neil Brown <neilb@suse.de> wrote:
>
> On Thursday April 20, akpm@osdl.org wrote:
> > > Cc: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
> > > Signed-off-by: Neil Brown <neilb@suse.de>
> > > 
> > > ### Diffstat output
> > >  ./mm/truncate.c |   10 ++++------
> > >  1 file changed, 4 insertions(+), 6 deletions(-)
> > > 
> > > diff ./mm/truncate.c~current~ ./mm/truncate.c
> > > --- ./mm/truncate.c~current~	2006-04-20 15:27:22.000000000 +1000
> > > +++ ./mm/truncate.c	2006-04-20 15:38:20.000000000 +1000
> > > @@ -238,13 +238,11 @@ unsigned long invalidate_mapping_pages(s
> > >  		for (i = 0; i < pagevec_count(&pvec); i++) {
> > >  			struct page *page = pvec.pages[i];
> > >  
> > > -			if (TestSetPageLocked(page)) {
> > > -				next++;
> > > +			next = page->index+1;
> > > +
> > > +			if (TestSetPageLocked(page))
> > >  				continue;
> > > -			}
> > > -			if (page->index > next)
> > > -				next = page->index;
> > > -			next++;
> > > +
> > >  			if (PageDirty(page) || PageWriteback(page))
> > >  				goto unlock;
> > >  			if (page_mapped(page))
> > 
> > We're not supposed to look at page->index of an unlocked page.
> 
> We're not?

We've avoided it.  But I think we could change the rules.

>  Does Jens know that?
> __generic_file_splice_read seems to violate this principle!

It looks OK from a quick read (but the code duplication is saddening)

> Are you allowed to look at ->mapping? Or can that change magically
> too?

No.  ->mapping can be set to NULL by truncate.   That's why everyone does

	get_page(page);
	lock_page(page);
	if (page->mapping == NULL)
		someone_just_truncated_this_page();

But truncate doesn't alter ->index.  And I think we can assume that.

>  What's the threat-model?  Is it splice(), or something more wicked?

truncate.

> > 
> > In practice, I think it's OK - there's no _reason_ why anyone would want to
> > trash the ->index of a just-truncated page.  However I think it'd be saner
> > to a) only look at ->index after we've tried to lock the page and b) make
> > sure that ->index is really "to the right" of where we're currently at.
> > 
> > How's this look?
> > 
> 
> Uhmm... possibly OK, but I think I'd rather change find_get_pages to
> take an index pointer like find_get_pages_tag does, and do the thing
> safely.  However that started turning into a big patch (reiserfs calls
> find_get_pages directly a few times, and I hadn't even got up to
> callers of pagevec_lookup....

Yes, that could get involved.

> So when I have a cleared head and a bit more time I'll see if I can
> come up with a better patch which only looks at ->index under
> ->tree_lock.

tree_lock will stabilise ->index, yes.

But I think we'd be OK assuming that ->index is stable.  Although that may
break if splice() is concurrently pulling the page out of pagecache and
stuffing it into a pipe.  


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-20  9:50         ` Neil Brown
@ 2006-04-20  9:59           ` Jens Axboe
  2006-04-20 10:26             ` Neil Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Jens Axboe @ 2006-04-20  9:59 UTC (permalink / raw)
  To: Neil Brown; +Cc: Andrew Morton, linux-kernel, sgunderson

On Thu, Apr 20 2006, Neil Brown wrote:
> On Thursday April 20, axboe@suse.de wrote:
> > > > 
> > > > We're not supposed to look at page->index of an unlocked page.
> > > 
> > > We're not?  Does Jens know that?
> > > __generic_file_splice_read seems to violate this principle!
> > 
> > Hmmm, where?
> > 
> 2.6.17-rc1-mm1
> fs/splice.c
> 
> static int __generic_file_splice_read(struct file *in, struct inode *pipe,
> 				      size_t len, unsigned int flags)
> {
> ...snip....
> 	do_page_cache_readahead(mapping, in, index, nr_pages);
> 
> 	/*
> 	 * Get as many pages from the page cache as possible..
> 	 * Start IO on the page cache entries we create (we
> 	 * can assume that any pre-existing ones we find have
> 	 * already had IO started on them).
> 	 */
> 	i = find_get_pages(mapping, index, nr_pages, pages);
> 
> 	/*
> 	 * common case - we found all pages and they are contiguous,
> 	 * kick them off
> 	 */
> 	if (i && (pages[i - 1]->index == index + i - 1))
>                   ^^^^^^^^^^^^^^^^^^^^
> ***There and
> 		goto splice_them;
> 
> 	/*
> 	 * fill shadow[] with pages at the right locations, so we only
> 	 * have to fill holes
> 	 */
> 	memset(shadow, 0, nr_pages * sizeof(struct page *));
> 	for (j = 0; j < i; j++)
> 		shadow[pages[j]->index - index] = pages[j];
>                       ^^^^^^^^^^^^^^^^
> 
> ***There

That code is ancient :-). See -rc2.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-20  9:59           ` Jens Axboe
@ 2006-04-20 10:26             ` Neil Brown
  0 siblings, 0 replies; 27+ messages in thread
From: Neil Brown @ 2006-04-20 10:26 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andrew Morton, linux-kernel, sgunderson

On Thursday April 20, axboe@suse.de wrote:
> 
> That code is ancient :-). See -rc2.

Ahhh...  golly, you're right.  It's nearly two weeks old now.  It's
hard to get the hang of Internet-time sometimes :-)
Sorry for the noise.

NeilBrown

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-20  9:52       ` Andrew Morton
@ 2006-04-20 10:52         ` Jens Axboe
  0 siblings, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2006-04-20 10:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Neil Brown, linux-kernel, sgunderson

On Thu, Apr 20 2006, Andrew Morton wrote:
> >  Does Jens know that?
> > __generic_file_splice_read seems to violate this principle!
> 
> It looks OK from a quick read (but the code duplication is saddening)

Yes, it saddens me too. There really are a bunch of cases to check for
on each page that is shared with do_generic_mapping_read(), I'll see if
I can do something about that.

> > So when I have a cleared head and a bit more time I'll see if I can
> > come up with a better patch which only looks at ->index under
> > ->tree_lock.
> 
> tree_lock will stabilise ->index, yes.
> 
> But I think we'd be OK assuming that ->index is stable.  Although that may
> break if splice() is concurrently pulling the page out of pagecache and
> stuffing it into a pipe.  

Putting the page in the pipe is a simple reference operations. However,
we can migrate a page from a pipe into the page cache in another address
space. If we do that, we will lock the page first though. And the actual
->index change is inside the tree_lock, outside of splice itself.

So with splice migrating pages, ->index and ->mapping can and will
change but only with the page locked.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-20  7:38   ` Andrew Morton
  2006-04-20  9:24     ` Neil Brown
@ 2006-04-26 20:48     ` Steinar H. Gunderson
  2006-04-26 20:58       ` Andrew Morton
  1 sibling, 1 reply; 27+ messages in thread
From: Steinar H. Gunderson @ 2006-04-26 20:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: NeilBrown, linux-kernel

On Thu, Apr 20, 2006 at 12:38:39AM -0700, Andrew Morton wrote:
> In practice, I think it's OK - there's no _reason_ why anyone would want to
> trash the ->index of a just-truncated page.  However I think it'd be saner
> to a) only look at ->index after we've tried to lock the page and b) make
> sure that ->index is really "to the right" of where we're currently at.
> 
> How's this look?
> 
> --- devel/mm/truncate.c~remove-softlockup-from-invalidate_mapping_pages	2006-04-20 00:20:49.000000000 -0700
> +++ devel-akpm/mm/truncate.c	2006-04-20 00:28:18.000000000 -0700

I tried this patch against 2.6.17-rc2 (I hoped that it might be fixing my
kswapd oopses too, as they seem related; see
http://lkml.org/lkml/2006/4/26/124 and followups), and it simply makes my
machine hang on bootup -- it seems to make modprobe hang forever on some lock
or something right after it loads raid6.ko (pulled in by evms_activate) in
initramfs. Without the patch, the machine boots just fine.

Since it's on serial console, I've got a SysRq+T log if it helps you.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-26 20:48     ` Steinar H. Gunderson
@ 2006-04-26 20:58       ` Andrew Morton
  2006-04-26 21:26         ` Steinar H. Gunderson
  2006-05-13 13:49         ` [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related) Steinar H. Gunderson
  0 siblings, 2 replies; 27+ messages in thread
From: Andrew Morton @ 2006-04-26 20:58 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: neilb, linux-kernel

"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:
>
> I tried this patch against 2.6.17-rc2 (I hoped that it might be fixing my
> kswapd oopses too, as they seem related; see
> http://lkml.org/lkml/2006/4/26/124 and followups), and it simply makes my
> machine hang on bootup -- it seems to make modprobe hang forever on some lock
> or something right after it loads raid6.ko (pulled in by evms_activate) in
> initramfs. Without the patch, the machine boots just fine.

It had a silly bug.  Fixed version:


diff -puN mm/truncate.c~remove-softlockup-from-invalidate_mapping_pages mm/truncate.c
--- devel/mm/truncate.c~remove-softlockup-from-invalidate_mapping_pages	2006-04-21 19:45:11.000000000 -0700
+++ devel-akpm/mm/truncate.c	2006-04-21 19:46:14.000000000 -0700
@@ -230,14 +230,24 @@ unsigned long invalidate_mapping_pages(s
 			pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
 		for (i = 0; i < pagevec_count(&pvec); i++) {
 			struct page *page = pvec.pages[i];
+			pgoff_t index;
+			int lock_failed;
 
-			if (TestSetPageLocked(page)) {
-				next++;
-				continue;
-			}
-			if (page->index > next)
-				next = page->index;
+			lock_failed = TestSetPageLocked(page);
+
+			/*
+			 * We really shouldn't be looking at the ->index of an
+			 * unlocked page.  But we're not allowed to lock these
+			 * pages.  So we rely upon nobody altering the ->index
+			 * of this (pinned-by-us) page.
+			 */
+			index = page->index;
+			if (index > next)
+				next = index;
 			next++;
+			if (lock_failed)
+				continue;
+
 			if (PageDirty(page) || PageWriteback(page))
 				goto unlock;
 			if (page_mapped(page))
_


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-26 20:58       ` Andrew Morton
@ 2006-04-26 21:26         ` Steinar H. Gunderson
  2006-05-01 16:07           ` Steinar H. Gunderson
  2006-05-13 13:49         ` [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related) Steinar H. Gunderson
  1 sibling, 1 reply; 27+ messages in thread
From: Steinar H. Gunderson @ 2006-04-26 21:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: neilb, linux-kernel

On Wed, Apr 26, 2006 at 01:58:09PM -0700, Andrew Morton wrote:
> It had a silly bug.  Fixed version:

Thanks, that boots. Of course, since I've only seen the RAID-5 hang bug
once ever, I've got no idea whether it actually fixes it, but I guess that if
the machine is stable for a week or so, it actually fixes my kswapd problem
too :-)

Oddly enough, I've seen other oopses on 2.6 with amd64 and software RAID
(http://blog.madduck.net/geek/2006.01.07-kernel-oops.html) involving
invalidate_mapping_page somehow. I wonder if there's a trend here somewhere
or if I'm just into conspiracy theories...

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages.
  2006-04-26 21:26         ` Steinar H. Gunderson
@ 2006-05-01 16:07           ` Steinar H. Gunderson
  0 siblings, 0 replies; 27+ messages in thread
From: Steinar H. Gunderson @ 2006-05-01 16:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: neilb, linux-kernel

On Wed, Apr 26, 2006 at 11:26:59PM +0200, Steinar H. Gunderson wrote:
>> It had a silly bug.  Fixed version:
> Thanks, that boots. Of course, since I've only seen the RAID-5 hang bug
> once ever, I've got no idea whether it actually fixes it, but I guess that if
> the machine is stable for a week or so, it actually fixes my kswapd problem
> too :-)

pannekake:~> uptime
 18:07:08 up 4 days, 18:51,  1 user,  load average: 0.28, 0.18, 0.11

Wow. I'm starting to believe this patch actually helped... :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-04-26 20:58       ` Andrew Morton
  2006-04-26 21:26         ` Steinar H. Gunderson
@ 2006-05-13 13:49         ` Steinar H. Gunderson
  2006-05-13 14:33           ` Andrew Morton
  1 sibling, 1 reply; 27+ messages in thread
From: Steinar H. Gunderson @ 2006-05-13 13:49 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Wed, Apr 26, 2006 at 01:58:09PM -0700, Andrew Morton wrote:
>> I tried this patch against 2.6.17-rc2 (I hoped that it might be fixing my
>> kswapd oopses too, as they seem related; see
>> http://lkml.org/lkml/2006/4/26/124 and followups), and it simply makes my
>> machine hang on bootup -- it seems to make modprobe hang forever on some lock
>> or something right after it loads raid6.ko (pulled in by evms_activate) in
>> initramfs. Without the patch, the machine boots just fine.
> It had a silly bug.  Fixed version:

That worked fine for a long time (>14 days), but I had to switch motherboards
(from a cheap Epox to a Tyan server-board) due to external factors. Since
then, stuff has started panicing again wildly -- 2.6.17-rc2 doesn't even
boot, 2.6.17-rc3 and 2.6.17-rc4 lives for an hour or so and then gives up:

[ 1127.842645] Unable to handle kernel NULL pointer dereference at 0000000000000040 RIP: 
[ 1127.848117] <ffffffff803a1ae8>{__lock_text_start+0}
[ 1127.855474] PGD 5e38a067 PUD 5e39d067 PMD 0 
[ 1127.859770] Oops: 0002 [1] SMP 
[ 1127.862931] CPU 1 
[ 1127.864957] Modules linked in: w83627hf_wdt eeprom ide_generic ide_disk ide_cd cdrom ipv6 psmouse i2c_nforce2 serio_raw pcspkr i2c_core parport_pc parport rtc e
xt3 jbd mbcahe raid6 raid5 xor raid10 raid1 raid0 linear mdd dm_mod sd_mod sata_nv sata_sil libata scsi_mod amd74xx generic forcedeth tg3 ide_core ohci_hcd ehci_hc
d thermal processor fan unix
[ 1127.896622] Pid: 0, comm: swapper Not tainted 2.6.17-rc4 #1
[ 1127.902191] RIP: 0010:[<ffffffff803a1ae8>] <ffffffff803a1ae8>{__lock_text_start+0}
[ 1127.909589] RSP: 0018:ffff81000245bd70  EFLAGS: 00010002
[ 1127.915094] RAX: 0000000000005d09 RBX: 0000000000005d09 RCX: ffff8100020446a8
[ 1127.922221] RDX: ffff81007e2ee800 RSI: ffff810002044648 RDI: 0000000000000040
[ 1127.929346] RBP: 0000000000005d09 R08: 0000000000000000 R09: 0000000000000000
[ 1127.936472] R10: 0000000000000001 R11: ffffffff8024c868 R12: ffff81007ddb41c0
[ 1127.943599] R13: 0000000000000296 R14: ffff81007caaf650 R15: 0000000000000000
[ 1127.950726] FS:  0000000000000000(0000) GS:ffff81007f827840(0000) knlGS:00000000f7d666c0
[ 1127.958819] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 1127.964559] CR2: 0000000000000040 CR3: 0000000062aec000 CR4: 00000000000006e0
[ 1127.971686] Process swapper (pid: 0, threadinfo ffff810002452000, task ffff810002444080)
[ 1127.979777] Stack: ffffffff802668fb ffff81007caaf650 ffffffff880d3a29 ffff81007caaf650 
[ 1127.987634]        0000000000000000 ffff81007cf689f0 ffff81003d41d240 0000000000000000 
[ 1127.995684]        ffffffff880d3b5f ffff81007e3530e8 
[ 1128.000757] Call Trace: <IRQ> <ffffffff802668fb>{kmem_cache_free+253}
[ 1128.007231]        <ffffffff880d3a29>{:dm_mod:dec_pending+169} <ffffffff880d3b5f>{:dm_mod:clone_endio+127}
[ 1128.016932]        <ffffffff802c9372>{__end_that_request_first+420} <ffffffff8808e8a6>{:scsi_mod:scsi_end_request+40}
[ 1128.027589]        <ffffffff8808eb51>{:scsi_mod:scsi_io_completion+522}
[ 1128.034222]        <ffffffff880cc4a1>{:sd_mod:sd_rw_intr+623} <ffffffff8808f5d6>{:scsi_mod:scsi_device_unbusy+85}
[ 1128.044527]        <ffffffff802c86cb>{blk_done_softirq+113} <ffffffff8022c41b>{__do_softirq+86}
[ 1128.053265]        <ffffffff8020a742>{call_softirq+30} <ffffffff8020b902>{do_softirq+44}
[ 1128.061396]        <ffffffff8020b947>{do_IRQ+65} <ffffffff8020827b>{default_idle+0}
[ 1128.069091]        <ffffffff80209aa0>{ret_from_intr+0} <EOI> <ffffffff803a02a0>{thread_return+0}
[ 1128.077926]        <ffffffff802082a8>{default_idle+45} <ffffffff80208335>{cpu_idle+98}
[ 1128.085883]        <ffffffff804d5c22>{start_secondary+1127}
[ 1128.091670] 
[ 1128.091671] Code: f0 ff 0f 0f 88 c8 01 00 00 c3 f0 ff 0f 8b 07 ba 01 00 00 00 
[ 1128.100588] RIP <ffffffff803a1ae8>{__lock_text_start+0} RSP <ffff81000245bd70>
[ 1128.107832] CR2: 0000000000000040
[ 1128.111378]  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
[ 1128.118498]  <0>Rebooting in 60 seconds..

This is with the remove-softlockup-from-invalidate_mapping_pages patch, but
it looks like it's crashing somewhere else. Any good ideas? Is this related
to the memory management at all, or is is better left to the dm people?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-05-13 13:49         ` [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related) Steinar H. Gunderson
@ 2006-05-13 14:33           ` Andrew Morton
  2006-05-13 14:43             ` Steinar H. Gunderson
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Morton @ 2006-05-13 14:33 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: linux-kernel

"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:
>
> On Wed, Apr 26, 2006 at 01:58:09PM -0700, Andrew Morton wrote:
> >> I tried this patch against 2.6.17-rc2 (I hoped that it might be fixing my
> >> kswapd oopses too, as they seem related; see
> >> http://lkml.org/lkml/2006/4/26/124 and followups), and it simply makes my
> >> machine hang on bootup -- it seems to make modprobe hang forever on some lock
> >> or something right after it loads raid6.ko (pulled in by evms_activate) in
> >> initramfs. Without the patch, the machine boots just fine.
> > It had a silly bug.  Fixed version:
> 
> That worked fine for a long time (>14 days), but I had to switch motherboards
> (from a cheap Epox to a Tyan server-board) due to external factors. Since
> then, stuff has started panicing again wildly -- 2.6.17-rc2 doesn't even
> boot, 2.6.17-rc3 and 2.6.17-rc4 lives for an hour or so and then gives up:
> 
> [ 1127.842645] Unable to handle kernel NULL pointer dereference at 0000000000000040 RIP: 
> [ 1127.848117] <ffffffff803a1ae8>{__lock_text_start+0}
> [ 1127.855474] PGD 5e38a067 PUD 5e39d067 PMD 0 
> [ 1127.859770] Oops: 0002 [1] SMP 
> [ 1127.862931] CPU 1 
> [ 1127.864957] Modules linked in: w83627hf_wdt eeprom ide_generic ide_disk ide_cd cdrom ipv6 psmouse i2c_nforce2 serio_raw pcspkr i2c_core parport_pc parport rtc e
> xt3 jbd mbcahe raid6 raid5 xor raid10 raid1 raid0 linear mdd dm_mod sd_mod sata_nv sata_sil libata scsi_mod amd74xx generic forcedeth tg3 ide_core ohci_hcd ehci_hc
> d thermal processor fan unix
> [ 1127.896622] Pid: 0, comm: swapper Not tainted 2.6.17-rc4 #1
> [ 1127.902191] RIP: 0010:[<ffffffff803a1ae8>] <ffffffff803a1ae8>{__lock_text_start+0}
> [ 1127.909589] RSP: 0018:ffff81000245bd70  EFLAGS: 00010002
> [ 1127.915094] RAX: 0000000000005d09 RBX: 0000000000005d09 RCX: ffff8100020446a8
> [ 1127.922221] RDX: ffff81007e2ee800 RSI: ffff810002044648 RDI: 0000000000000040
> [ 1127.929346] RBP: 0000000000005d09 R08: 0000000000000000 R09: 0000000000000000
> [ 1127.936472] R10: 0000000000000001 R11: ffffffff8024c868 R12: ffff81007ddb41c0
> [ 1127.943599] R13: 0000000000000296 R14: ffff81007caaf650 R15: 0000000000000000
> [ 1127.950726] FS:  0000000000000000(0000) GS:ffff81007f827840(0000) knlGS:00000000f7d666c0
> [ 1127.958819] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [ 1127.964559] CR2: 0000000000000040 CR3: 0000000062aec000 CR4: 00000000000006e0
> [ 1127.971686] Process swapper (pid: 0, threadinfo ffff810002452000, task ffff810002444080)
> [ 1127.979777] Stack: ffffffff802668fb ffff81007caaf650 ffffffff880d3a29 ffff81007caaf650 
> [ 1127.987634]        0000000000000000 ffff81007cf689f0 ffff81003d41d240 0000000000000000 
> [ 1127.995684]        ffffffff880d3b5f ffff81007e3530e8 
> [ 1128.000757] Call Trace: <IRQ> <ffffffff802668fb>{kmem_cache_free+253}
> [ 1128.007231]        <ffffffff880d3a29>{:dm_mod:dec_pending+169} <ffffffff880d3b5f>{:dm_mod:clone_endio+127}
> [ 1128.016932]        <ffffffff802c9372>{__end_that_request_first+420} <ffffffff8808e8a6>{:scsi_mod:scsi_end_request+40}
> [ 1128.027589]        <ffffffff8808eb51>{:scsi_mod:scsi_io_completion+522}
> [ 1128.034222]        <ffffffff880cc4a1>{:sd_mod:sd_rw_intr+623} <ffffffff8808f5d6>{:scsi_mod:scsi_device_unbusy+85}
> [ 1128.044527]        <ffffffff802c86cb>{blk_done_softirq+113} <ffffffff8022c41b>{__do_softirq+86}
> [ 1128.053265]        <ffffffff8020a742>{call_softirq+30} <ffffffff8020b902>{do_softirq+44}
> [ 1128.061396]        <ffffffff8020b947>{do_IRQ+65} <ffffffff8020827b>{default_idle+0}
> [ 1128.069091]        <ffffffff80209aa0>{ret_from_intr+0} <EOI> <ffffffff803a02a0>{thread_return+0}
> [ 1128.077926]        <ffffffff802082a8>{default_idle+45} <ffffffff80208335>{cpu_idle+98}
> [ 1128.085883]        <ffffffff804d5c22>{start_secondary+1127}
> [ 1128.091670] 
> [ 1128.091671] Code: f0 ff 0f 0f 88 c8 01 00 00 c3 f0 ff 0f 8b 07 ba 01 00 00 00 
> [ 1128.100588] RIP <ffffffff803a1ae8>{__lock_text_start+0} RSP <ffff81000245bd70>
> [ 1128.107832] CR2: 0000000000000040
> [ 1128.111378]  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
> [ 1128.118498]  <0>Rebooting in 60 seconds..
> 
> This is with the remove-softlockup-from-invalidate_mapping_pages patch, but
> it looks like it's crashing somewhere else. Any good ideas? Is this related
> to the memory management at all, or is is better left to the dm people?
> 

Well if it's the same software lineup on new hardware, one would also
suspect that hardware.  Is it new?

Does it run other kernels OK?

Does it always crash in the same manner?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-05-13 14:33           ` Andrew Morton
@ 2006-05-13 14:43             ` Steinar H. Gunderson
  2006-05-13 14:51               ` Andrew Morton
  0 siblings, 1 reply; 27+ messages in thread
From: Steinar H. Gunderson @ 2006-05-13 14:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Sat, May 13, 2006 at 07:33:44AM -0700, Andrew Morton wrote:
> Well if it's the same software lineup on new hardware, one would also
> suspect that hardware.  Is it new?

Yes, it's new. The differences aren't that big, though: The motherboard has
been changed, and there's an extra sil3114 controller.

> Does it run other kernels OK?

2.6.15.4 appears to be fine, but I haven't tested it enough to make sure.
(I'm running 2.6.17-rc4 without swap now, so we'll see.)

> Does it always crash in the same manner?

Yes; consistently and in the same place after about the same amount of time.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-05-13 14:43             ` Steinar H. Gunderson
@ 2006-05-13 14:51               ` Andrew Morton
  2006-05-13 15:00                 ` Steinar H. Gunderson
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Morton @ 2006-05-13 14:51 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: linux-kernel

"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:
>
> On Sat, May 13, 2006 at 07:33:44AM -0700, Andrew Morton wrote:
> > Well if it's the same software lineup on new hardware, one would also
> > suspect that hardware.  Is it new?
> 
> Yes, it's new. The differences aren't that big, though: The motherboard has
> been changed, and there's an extra sil3114 controller.
> 
> > Does it run other kernels OK?
> 
> 2.6.15.4 appears to be fine, but I haven't tested it enough to make sure.
> (I'm running 2.6.17-rc4 without swap now, so we'll see.)
> 
> > Does it always crash in the same manner?
> 
> Yes; consistently and in the same place after about the same amount of time.

ho-hum.  Please see if there's anything else you can do to rule out a
hardware failure, then copy dm-devel@redhat.com on the next oops report.

The stack backtrace you have there is a little surprising.  Enabling
CONFIG_FRAME_POINTER might help clear it up.  Also it'd be worth seeing if
CONFIG_DEBUG_SLAB turns up anything.

Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-05-13 14:51               ` Andrew Morton
@ 2006-05-13 15:00                 ` Steinar H. Gunderson
  2006-05-13 15:24                     ` Andrew Morton
  0 siblings, 1 reply; 27+ messages in thread
From: Steinar H. Gunderson @ 2006-05-13 15:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, dm-devel

On Sat, May 13, 2006 at 07:51:47AM -0700, Andrew Morton wrote:
> ho-hum.  Please see if there's anything else you can do to rule out a
> hardware failure, then copy dm-devel@redhat.com on the next oops report.

It's not swap related. It crashes even without swap enabled; still with lots of
dm stuff in the backtrace, though:

[ 3192.568880] general protection fault: 0000 [1] SMP 
[ 3192.573779] CPU 1 
[ 3192.575804] Modules linked in: w83627hf_wdt eeprom ide_generic ide_disk ide_cd cdrom ipv6 psmouse i2c_nforce2 serio_raw pcspkr i2c_core parport_pc parport rtc ext3 jbd mbcache raid6 raid5 xor raid10 raid1 raid0 linear md_mod dm_mod sd_mod sata_nv tg3 sata_sil libata scsi_mod forcedeth generic amd74xx ehci_hcd ide_core ohc i_hcd thermal processor fan unix
[ 3192.607472] Pid: 3432, comm: md1_raid5 Not tainted 2.6.17-rc4 #1
[ 3192.613471] RIP: 0010:[<ffffffff803a1ae8>] <ffffffff803a1ae8>{__lock_text_start+0}
[ 3192.620870] RSP: 0018:ffff81000245bd70  EFLAGS: 00210086
[ 3192.626374] RAX: 000000000000fa40 RBX: aaaa8b5ad269b80f RCX: ffff81000151ff50
[ 3192.633501] RDX: ffff81007febd600 RSI: ffff81000151fef0 RDI: aaaa8b5ad269b81f
[ 3192.640628] RBP: 000000000000fa40 R08: ffff81007db6d2c0 R09: ffff81007db6d2c0
[ 3192.647756] R10: 0000000000000007 R11: ffffffff8024c868 R12: ffff81007febf040
[ 3192.654882] R13: 0000000000200296 R14: ffff810004a97fb0 R15: 0000000000000000
[ 3192.662009] FS:  0000000000000000(0000) GS:ffff81007f827840(0000) knlGS:00000000f7ad9ae0
[ 3192.670101] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 3192.675843] CR2: 00000000080850e0 CR3: 000000007c11e000 CR4: 00000000000006e0
[ 3192.682970] Process md1_raid5 (pid: 3432, threadinfo ffff81007ddb6000, task ffff81007f997080)
[ 3192.691495] Stack: ffffffff802668b8 ffff81007f27c880 ffff810049bb23c0 ffff810049bb23c0 
[ 3192.699353]        0000000000000000 ffff81007d0788b8 ffff810049bb23c0 0000000000000000 
[ 3192.707404]        ffffffff880d3b67 ffff81007ed0cba8 
[ 3192.712476] Call Trace: <IRQ> <ffffffff802668b8>{kmem_cache_free+186}
[ 3192.718950]        <ffffffff880d3b67>{:dm_mod:clone_endio+135} <ffffffff802c9372>{__end_that_request_first+420}
[ 3192.729081]        <ffffffff802c7d1b>{blk_run_queue+62} <ffffffff8806f8a6>{:scsi_mod:scsi_end_request+40}
[ 3192.738700]        <ffffffff8806fb51>{:scsi_mod:scsi_io_completion+522}
[ 3192.745334]        <ffffffff880cc4a1>{:sd_mod:sd_rw_intr+623} <ffffffff880705d6>{:scsi_mod:scsi_device_unbusy+85}
[ 3192.755641]        <ffffffff802c86cb>{blk_done_softirq+113} <ffffffff8022c41b>{__do_softirq+86}
[ 3192.764377]        <ffffffff8020a742>{call_softirq+30} <ffffffff8020b902>{do_softirq+44}
[ 3192.772509]        <ffffffff8020b947>{do_IRQ+65} <ffffffff80209aa0>{ret_from_intr+0} <EOI>
[ 3192.780822]        <ffffffff881129a7>{:raid5:compute_parity+880} <ffffffff802d5e2f>{memcmp+11}
[ 3192.789476]        <ffffffff8811467a>{:raid5:handle_stripe+3022} <ffffffff80238a7c>{keventd_create_kthread+0}
[ 3192.799424]        <ffffffff80238a7c>{keventd_create_kthread+0} <ffffffff881153e9>{:raid5:raid5d+333}
[ 3192.808682]        <ffffffff880e464f>{:md_mod:md_thread+0} <ffffffff880e4751>{:md_mod:md_thread+258}
[ 3192.817864]        <ffffffff80238e78>{autoremove_wake_function+0} <ffffffff880e464f>{:md_mod:md_thread+0}
[ 3192.827471]        <ffffffff80238cc4>{kthread+203} <ffffffff8020a3f2>{child_rip+8}
[ 3192.835081]        <ffffffff80238a7c>{keventd_create_kthread+0} <ffffffff80238bf9>{kthread+0}
[ 3192.843646]        <ffffffff8020a3ea>{child_rip+0}
[ 3192.848646] 
[ 3192.848647] Code: f0 ff 0f 0f 88 c8 01 00 00 c3 f0 ff 0f 8b 07 ba 01 00 00 00 
[ 3192.857563] RIP <ffffffff803a1ae8>{__lock_text_start+0} RSP <ffff81000245bd70>
[ 3192.865039]  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
[ 3192.872164]  <0>Rebooting in 60 seconds..

(Thank goodness for serial console; I couldn't possibly write all these oopses
by hand. :-) )

> The stack backtrace you have there is a little surprising.  Enabling
> CONFIG_FRAME_POINTER might help clear it up.  Also it'd be worth seeing if
> CONFIG_DEBUG_SLAB turns up anything.

I'm recompiling 2.6.17-rc4 now with those two added in. I'll let you know in a
few hours when it crashes again, I'd guess :-)

Would it be a good idea to revert your mm patch and test again, just in case?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-05-13 15:00                 ` Steinar H. Gunderson
@ 2006-05-13 15:24                     ` Andrew Morton
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2006-05-13 15:24 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: linux-kernel, dm-devel

"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:
>
> On Sat, May 13, 2006 at 07:51:47AM -0700, Andrew Morton wrote:
> > ho-hum.  Please see if there's anything else you can do to rule out a
> > hardware failure, then copy dm-devel@redhat.com on the next oops report.
> 
> It's not swap related. It crashes even without swap enabled; still with lots of
> dm stuff in the backtrace, though:

(for dm-devel: 2.6.15.4 runs OK on this machine with the same config) (yes?)

> [ 3192.568880] general protection fault: 0000 [1] SMP 
> [ 3192.573779] CPU 1 
> [ 3192.575804] Modules linked in: w83627hf_wdt eeprom ide_generic ide_disk ide_cd cdrom ipv6 psmouse i2c_nforce2 serio_raw pcspkr i2c_core parport_pc parport rtc ext3 jbd mbcache raid6 raid5 xor raid10 raid1 raid0 linear md_mod dm_mod sd_mod sata_nv tg3 sata_sil libata scsi_mod forcedeth generic amd74xx ehci_hcd ide_core ohc i_hcd thermal processor fan unix
> [ 3192.607472] Pid: 3432, comm: md1_raid5 Not tainted 2.6.17-rc4 #1
> [ 3192.613471] RIP: 0010:[<ffffffff803a1ae8>] <ffffffff803a1ae8>{__lock_text_start+0}
> [ 3192.620870] RSP: 0018:ffff81000245bd70  EFLAGS: 00210086
> [ 3192.626374] RAX: 000000000000fa40 RBX: aaaa8b5ad269b80f RCX: ffff81000151ff50
> [ 3192.633501] RDX: ffff81007febd600 RSI: ffff81000151fef0 RDI: aaaa8b5ad269b81f
> [ 3192.640628] RBP: 000000000000fa40 R08: ffff81007db6d2c0 R09: ffff81007db6d2c0
> [ 3192.647756] R10: 0000000000000007 R11: ffffffff8024c868 R12: ffff81007febf040
> [ 3192.654882] R13: 0000000000200296 R14: ffff810004a97fb0 R15: 0000000000000000
> [ 3192.662009] FS:  0000000000000000(0000) GS:ffff81007f827840(0000) knlGS:00000000f7ad9ae0
> [ 3192.670101] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [ 3192.675843] CR2: 00000000080850e0 CR3: 000000007c11e000 CR4: 00000000000006e0
> [ 3192.682970] Process md1_raid5 (pid: 3432, threadinfo ffff81007ddb6000, task ffff81007f997080)
> [ 3192.691495] Stack: ffffffff802668b8 ffff81007f27c880 ffff810049bb23c0 ffff810049bb23c0 
> [ 3192.699353]        0000000000000000 ffff81007d0788b8 ffff810049bb23c0 0000000000000000 
> [ 3192.707404]        ffffffff880d3b67 ffff81007ed0cba8 
> [ 3192.712476] Call Trace: <IRQ> <ffffffff802668b8>{kmem_cache_free+186}
> [ 3192.718950]        <ffffffff880d3b67>{:dm_mod:clone_endio+135} <ffffffff802c9372>{__end_that_request_first+420}
> [ 3192.729081]        <ffffffff802c7d1b>{blk_run_queue+62} <ffffffff8806f8a6>{:scsi_mod:scsi_end_request+40}
> [ 3192.738700]        <ffffffff8806fb51>{:scsi_mod:scsi_io_completion+522}
> [ 3192.745334]        <ffffffff880cc4a1>{:sd_mod:sd_rw_intr+623} <ffffffff880705d6>{:scsi_mod:scsi_device_unbusy+85}
> [ 3192.755641]        <ffffffff802c86cb>{blk_done_softirq+113} <ffffffff8022c41b>{__do_softirq+86}
> [ 3192.764377]        <ffffffff8020a742>{call_softirq+30} <ffffffff8020b902>{do_softirq+44}
> [ 3192.772509]        <ffffffff8020b947>{do_IRQ+65} <ffffffff80209aa0>{ret_from_intr+0} <EOI>
> [ 3192.780822]        <ffffffff881129a7>{:raid5:compute_parity+880} <ffffffff802d5e2f>{memcmp+11}
> [ 3192.789476]        <ffffffff8811467a>{:raid5:handle_stripe+3022} <ffffffff80238a7c>{keventd_create_kthread+0}
> [ 3192.799424]        <ffffffff80238a7c>{keventd_create_kthread+0} <ffffffff881153e9>{:raid5:raid5d+333}
> [ 3192.808682]        <ffffffff880e464f>{:md_mod:md_thread+0} <ffffffff880e4751>{:md_mod:md_thread+258}
> [ 3192.817864]        <ffffffff80238e78>{autoremove_wake_function+0} <ffffffff880e464f>{:md_mod:md_thread+0}
> [ 3192.827471]        <ffffffff80238cc4>{kthread+203} <ffffffff8020a3f2>{child_rip+8}
> [ 3192.835081]        <ffffffff80238a7c>{keventd_create_kthread+0} <ffffffff80238bf9>{kthread+0}
> [ 3192.843646]        <ffffffff8020a3ea>{child_rip+0}
> [ 3192.848646] 
> [ 3192.848647] Code: f0 ff 0f 0f 88 c8 01 00 00 c3 f0 ff 0f 8b 07 ba 01 00 00 00 
> [ 3192.857563] RIP <ffffffff803a1ae8>{__lock_text_start+0} RSP <ffff81000245bd70>
> [ 3192.865039]  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
> [ 3192.872164]  <0>Rebooting in 60 seconds..
> 
> (Thank goodness for serial console; I couldn't possibly write all these oopses
> by hand. :-) )
> 
> > The stack backtrace you have there is a little surprising.  Enabling
> > CONFIG_FRAME_POINTER might help clear it up.  Also it'd be worth seeing if
> > CONFIG_DEBUG_SLAB turns up anything.
> 
> I'm recompiling 2.6.17-rc4 now with those two added in. I'll let you know in a
> few hours when it crashes again, I'd guess :-)
> 
> Would it be a good idea to revert your mm patch and test again, just in case?

Which patch?  remove-softlockup-from-invalidate_mapping_pages.patch?  No,
that won't have caused this.  But then, it's not really obvious what this
crash is.

Now your earlier trace had this important info:

 [ 1127.842645] Unable to handle kernel NULL pointer dereference at 0000000000000040 RIP: 
 [ 1127.848117] <ffffffff803a1ae8>{__lock_text_start+0}
 [ 1127.855474] PGD 5e38a067 PUD 5e39d067 PMD 0 
 [ 1127.859770] Oops: 0002 [1] SMP 
 [ 1127.862931] CPU 1 
 

Which kind of implies that we passed a null (or very small small) `struct
kmem_cache' pointer into kmem_cache_free().  But that doesn't seem like the
sort of thing which will take hours to reproduce.

Do you have CONFIG_NUMA set?


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
@ 2006-05-13 15:24                     ` Andrew Morton
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2006-05-13 15:24 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: dm-devel, linux-kernel

"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:
>
> On Sat, May 13, 2006 at 07:51:47AM -0700, Andrew Morton wrote:
> > ho-hum.  Please see if there's anything else you can do to rule out a
> > hardware failure, then copy dm-devel@redhat.com on the next oops report.
> 
> It's not swap related. It crashes even without swap enabled; still with lots of
> dm stuff in the backtrace, though:

(for dm-devel: 2.6.15.4 runs OK on this machine with the same config) (yes?)

> [ 3192.568880] general protection fault: 0000 [1] SMP 
> [ 3192.573779] CPU 1 
> [ 3192.575804] Modules linked in: w83627hf_wdt eeprom ide_generic ide_disk ide_cd cdrom ipv6 psmouse i2c_nforce2 serio_raw pcspkr i2c_core parport_pc parport rtc ext3 jbd mbcache raid6 raid5 xor raid10 raid1 raid0 linear md_mod dm_mod sd_mod sata_nv tg3 sata_sil libata scsi_mod forcedeth generic amd74xx ehci_hcd ide_core ohc i_hcd thermal processor fan unix
> [ 3192.607472] Pid: 3432, comm: md1_raid5 Not tainted 2.6.17-rc4 #1
> [ 3192.613471] RIP: 0010:[<ffffffff803a1ae8>] <ffffffff803a1ae8>{__lock_text_start+0}
> [ 3192.620870] RSP: 0018:ffff81000245bd70  EFLAGS: 00210086
> [ 3192.626374] RAX: 000000000000fa40 RBX: aaaa8b5ad269b80f RCX: ffff81000151ff50
> [ 3192.633501] RDX: ffff81007febd600 RSI: ffff81000151fef0 RDI: aaaa8b5ad269b81f
> [ 3192.640628] RBP: 000000000000fa40 R08: ffff81007db6d2c0 R09: ffff81007db6d2c0
> [ 3192.647756] R10: 0000000000000007 R11: ffffffff8024c868 R12: ffff81007febf040
> [ 3192.654882] R13: 0000000000200296 R14: ffff810004a97fb0 R15: 0000000000000000
> [ 3192.662009] FS:  0000000000000000(0000) GS:ffff81007f827840(0000) knlGS:00000000f7ad9ae0
> [ 3192.670101] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [ 3192.675843] CR2: 00000000080850e0 CR3: 000000007c11e000 CR4: 00000000000006e0
> [ 3192.682970] Process md1_raid5 (pid: 3432, threadinfo ffff81007ddb6000, task ffff81007f997080)
> [ 3192.691495] Stack: ffffffff802668b8 ffff81007f27c880 ffff810049bb23c0 ffff810049bb23c0 
> [ 3192.699353]        0000000000000000 ffff81007d0788b8 ffff810049bb23c0 0000000000000000 
> [ 3192.707404]        ffffffff880d3b67 ffff81007ed0cba8 
> [ 3192.712476] Call Trace: <IRQ> <ffffffff802668b8>{kmem_cache_free+186}
> [ 3192.718950]        <ffffffff880d3b67>{:dm_mod:clone_endio+135} <ffffffff802c9372>{__end_that_request_first+420}
> [ 3192.729081]        <ffffffff802c7d1b>{blk_run_queue+62} <ffffffff8806f8a6>{:scsi_mod:scsi_end_request+40}
> [ 3192.738700]        <ffffffff8806fb51>{:scsi_mod:scsi_io_completion+522}
> [ 3192.745334]        <ffffffff880cc4a1>{:sd_mod:sd_rw_intr+623} <ffffffff880705d6>{:scsi_mod:scsi_device_unbusy+85}
> [ 3192.755641]        <ffffffff802c86cb>{blk_done_softirq+113} <ffffffff8022c41b>{__do_softirq+86}
> [ 3192.764377]        <ffffffff8020a742>{call_softirq+30} <ffffffff8020b902>{do_softirq+44}
> [ 3192.772509]        <ffffffff8020b947>{do_IRQ+65} <ffffffff80209aa0>{ret_from_intr+0} <EOI>
> [ 3192.780822]        <ffffffff881129a7>{:raid5:compute_parity+880} <ffffffff802d5e2f>{memcmp+11}
> [ 3192.789476]        <ffffffff8811467a>{:raid5:handle_stripe+3022} <ffffffff80238a7c>{keventd_create_kthread+0}
> [ 3192.799424]        <ffffffff80238a7c>{keventd_create_kthread+0} <ffffffff881153e9>{:raid5:raid5d+333}
> [ 3192.808682]        <ffffffff880e464f>{:md_mod:md_thread+0} <ffffffff880e4751>{:md_mod:md_thread+258}
> [ 3192.817864]        <ffffffff80238e78>{autoremove_wake_function+0} <ffffffff880e464f>{:md_mod:md_thread+0}
> [ 3192.827471]        <ffffffff80238cc4>{kthread+203} <ffffffff8020a3f2>{child_rip+8}
> [ 3192.835081]        <ffffffff80238a7c>{keventd_create_kthread+0} <ffffffff80238bf9>{kthread+0}
> [ 3192.843646]        <ffffffff8020a3ea>{child_rip+0}
> [ 3192.848646] 
> [ 3192.848647] Code: f0 ff 0f 0f 88 c8 01 00 00 c3 f0 ff 0f 8b 07 ba 01 00 00 00 
> [ 3192.857563] RIP <ffffffff803a1ae8>{__lock_text_start+0} RSP <ffff81000245bd70>
> [ 3192.865039]  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
> [ 3192.872164]  <0>Rebooting in 60 seconds..
> 
> (Thank goodness for serial console; I couldn't possibly write all these oopses
> by hand. :-) )
> 
> > The stack backtrace you have there is a little surprising.  Enabling
> > CONFIG_FRAME_POINTER might help clear it up.  Also it'd be worth seeing if
> > CONFIG_DEBUG_SLAB turns up anything.
> 
> I'm recompiling 2.6.17-rc4 now with those two added in. I'll let you know in a
> few hours when it crashes again, I'd guess :-)
> 
> Would it be a good idea to revert your mm patch and test again, just in case?

Which patch?  remove-softlockup-from-invalidate_mapping_pages.patch?  No,
that won't have caused this.  But then, it's not really obvious what this
crash is.

Now your earlier trace had this important info:

 [ 1127.842645] Unable to handle kernel NULL pointer dereference at 0000000000000040 RIP: 
 [ 1127.848117] <ffffffff803a1ae8>{__lock_text_start+0}
 [ 1127.855474] PGD 5e38a067 PUD 5e39d067 PMD 0 
 [ 1127.859770] Oops: 0002 [1] SMP 
 [ 1127.862931] CPU 1 
 

Which kind of implies that we passed a null (or very small small) `struct
kmem_cache' pointer into kmem_cache_free().  But that doesn't seem like the
sort of thing which will take hours to reproduce.

Do you have CONFIG_NUMA set?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-05-13 15:24                     ` Andrew Morton
  (?)
@ 2006-05-13 15:32                     ` Steinar H. Gunderson
  2006-05-13 15:43                         ` Andrew Morton
  -1 siblings, 1 reply; 27+ messages in thread
From: Steinar H. Gunderson @ 2006-05-13 15:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, dm-devel

On Sat, May 13, 2006 at 08:24:09AM -0700, Andrew Morton wrote:
> (for dm-devel: 2.6.15.4 runs OK on this machine with the same config) (yes?)

I can't really assert that; it doesn't _seem_ to be bothered the same way,
but I've only run it for an hour or so to compile the other kernels.

Just to be 100% sure it's not overheating or such, I've turned off all fan
control in the BIOS. But then again, overheating sounds really far off; the
machine is in a cooled server room (20 degrees Celsius), and has lots of fans
making sure the air flow is good.

> Which patch?  remove-softlockup-from-invalidate_mapping_pages.patch?  No,
> that won't have caused this.  But then, it's not really obvious what this
> crash is.

OK.

> Which kind of implies that we passed a null (or very small small) `struct
> kmem_cache' pointer into kmem_cache_free().  But that doesn't seem like the
> sort of thing which will take hours to reproduce.
> 
> Do you have CONFIG_NUMA set?

Hm, yes, for some reason CONFIG_NUMA, CONFIG_K8_NUMA and
CONFIG_x86_64_ACPI_NUMA are all set; I guess they're left from the stock
config I use at a base. (For those tuning in; this is a dual-core amd64. That
might be relevant somehow.)

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-05-13 15:32                     ` Steinar H. Gunderson
@ 2006-05-13 15:43                         ` Andrew Morton
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2006-05-13 15:43 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: linux-kernel, dm-devel

"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:
>
>  > Which kind of implies that we passed a null (or very small small) `struct
>  > kmem_cache' pointer into kmem_cache_free().  But that doesn't seem like the
>  > sort of thing which will take hours to reproduce.
>  > 
>  > Do you have CONFIG_NUMA set?
> 
>  Hm, yes, for some reason CONFIG_NUMA, CONFIG_K8_NUMA and
>  CONFIG_x86_64_ACPI_NUMA are all set; I guess they're left from the stock
>  config I use at a base. (For those tuning in; this is a dual-core amd64. That
>  might be relevant somehow.)

Please try it without NUMA - we've had a few problems there of late.  If
that fixes it then I have patches for you to test ;)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
@ 2006-05-13 15:43                         ` Andrew Morton
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2006-05-13 15:43 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: dm-devel, linux-kernel

"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:
>
>  > Which kind of implies that we passed a null (or very small small) `struct
>  > kmem_cache' pointer into kmem_cache_free().  But that doesn't seem like the
>  > sort of thing which will take hours to reproduce.
>  > 
>  > Do you have CONFIG_NUMA set?
> 
>  Hm, yes, for some reason CONFIG_NUMA, CONFIG_K8_NUMA and
>  CONFIG_x86_64_ACPI_NUMA are all set; I guess they're left from the stock
>  config I use at a base. (For those tuning in; this is a dual-core amd64. That
>  might be relevant somehow.)

Please try it without NUMA - we've had a few problems there of late.  If
that fixes it then I have patches for you to test ;)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-05-13 15:43                         ` Andrew Morton
  (?)
@ 2006-05-13 17:14                         ` Steinar H. Gunderson
  2006-05-13 23:28                           ` Steinar H. Gunderson
  -1 siblings, 1 reply; 27+ messages in thread
From: Steinar H. Gunderson @ 2006-05-13 17:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, dm-devel

On Sat, May 13, 2006 at 08:43:17AM -0700, Andrew Morton wrote:
> Please try it without NUMA - we've had a few problems there of late.  If
> that fixes it then I have patches for you to test ;)

No go:

[ 2364.512764] Unable to handle kernel paging request at ffffffff403ae552 RIP: 
[ 2364.517369] [<ffffffff403ae552>]
[ 2364.526396] PGD 203027 PUD 0 
[ 2364.529385] Oops: 0010 [1] SMP 
[ 2364.532546] CPU 0 
[ 2364.534571] Modules linked in: w83627hf_wdt eeprom ide_generic ide_disk ide_cd cdrom ipv6 psmouse i2c_nforce2 serio_raw pcspkr evdev parport_pc parport i2c_core rtc ext3 jbd mbcache raid6 raid5 xor rai
d10 raid1 raid0 linear md_mod dm_mod sd_mod sata_nv sata_sil libata tg3 scsi_mod forcedeth generic amd74xx ide_core ehci_hcd ohci_hcd thermal processor fan unix
[ 2364.566790] Pid: 3433, comm: md1_resync Not tainted 2.6.17-rc4 #1
[ 2364.572877] RIP: 0010:[<ffffffff403ae552>] [<ffffffff403ae552>]
[ 2364.578612] RSP: 0018:ffff81007d5d3c40  EFLAGS: 00010287
[ 2364.584118] RAX: 0000000000000010 RBX: ffff8100109cce90 RCX: 0000000000000290
[ 2364.591246] RDX: 000000000000000a RSI: 0000000000000282 RDI: ffff81007df604d8
[ 2364.598372] RBP: ffff81007d5d3c68 R08: 0000000000001d4b R09: ffff810076aa9e78
[ 2364.605499] R10: ffff810076aa9e78 R11: 0000000000001000 R12: ffff81007dca7448
[ 2364.612627] R13: ffff81007df60488 R14: ffff81007df73528 R15: ffff81007dca7448
[ 2364.619755] FS:  0000000000000000(0000) GS:ffffffff804c3000(0000) knlGS:00000000f7e968e0
[ 2364.627848] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 2364.633588] CR2: ffffffff403ae552 CR3: 000000007b91c000 CR4: 00000000000006e0
[ 2364.640715] Process md1_resync (pid: 3433, threadinfo ffff81007d5d2000, task ffff81007fcb47d0)
[ 2364.649325] Stack: ffffffff8806a93b ffff8100109cce90 ffff81007dc304c8 ffff81007df60488 
[ 2364.657182]        ffff81007df73528 ffff81007d5d3ca8 ffffffff8807070e 0000000000000001 
[ 2364.665238]        ffff81007df73528 ffff81007dd8b9a8 
[ 2364.670310] Call Trace: <ffffffff8806a93b>{:scsi_mod:scsi_dispatch_cmd+381}
[ 2364.677308]        <ffffffff8807070e>{:scsi_mod:scsi_request_fn+735} <ffffffff802cbdad>{__generic_unplug_device+36}
[ 2364.687790]        <ffffffff802cbed9>{generic_unplug_device+0} <ffffffff802cbef7>{generic_unplug_device+30}
[ 2364.697570]        <ffffffff880d753c>{:dm_mod:dm_table_unplug_all+52} <ffffffff880d51cf>{:dm_mod:dm_unplug_all+35}
[ 2364.707964]        <ffffffff8811272e>{:raid5:unplug_slaves+105} <ffffffff88112a7d>{:raid5:raid5_unplug_device+107}
[ 2364.718350]        <ffffffff880ec66a>{:md_mod:md_do_sync+1300} <ffffffff803acb5d>{thread_return+0}
[ 2364.727361]        <ffffffff880e67e8>{:md_mod:md_thread+0} <ffffffff880e68ed>{:md_mod:md_thread+261}
[ 2364.736542]        <ffffffff880e67e8>{:md_mod:md_thread+0} <ffffffff80239e67>{kthread+207}
[ 2364.744858]        <ffffffff8020a522>{child_rip+8} <ffffffff80239d98>{kthread+0}
[ 2364.752300]        <ffffffff8020a51a>{child_rip+0}
[ 2364.757303] 
[ 2364.757304] Code:  Bad RIP value.
[ 2364.762134] RIP [<ffffffff403ae552>] RSP <ffff81007d5d3c40>
[ 2364.767722] CR2: ffffffff403ae552

This is about the same crash as I used to see with -rc2.

I guess it's time for memtest86, just in case I botched the RAM while moving it...

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dm-devel] Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-05-13 15:43                         ` Andrew Morton
@ 2006-05-13 22:37                           ` Alasdair G Kergon
  -1 siblings, 0 replies; 27+ messages in thread
From: Alasdair G Kergon @ 2006-05-13 22:37 UTC (permalink / raw)
  To: device-mapper development; +Cc: Steinar H. Gunderson, linux-kernel

What's the device configuration here:
   e.g. linear dm logical volume over md raid5?

[e.g. post output of dmsetup info -c' and 'dmsetup table'
and 'dmsetup status' remembering to blank out the
long hex string if you have dm-crypt]

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
@ 2006-05-13 22:37                           ` Alasdair G Kergon
  0 siblings, 0 replies; 27+ messages in thread
From: Alasdair G Kergon @ 2006-05-13 22:37 UTC (permalink / raw)
  To: device-mapper development; +Cc: Steinar H. Gunderson, linux-kernel

What's the device configuration here:
   e.g. linear dm logical volume over md raid5?

[e.g. post output of dmsetup info -c' and 'dmsetup table'
and 'dmsetup status' remembering to blank out the
long hex string if you have dm-crypt]

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related)
  2006-05-13 17:14                         ` Steinar H. Gunderson
@ 2006-05-13 23:28                           ` Steinar H. Gunderson
  0 siblings, 0 replies; 27+ messages in thread
From: Steinar H. Gunderson @ 2006-05-13 23:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, dm-devel, neilb

On Sat, May 13, 2006 at 07:14:14PM +0200, Steinar H. Gunderson wrote:
> I guess it's time for memtest86, just in case I botched the RAM while
> moving it...

Good news! :-) memtest86+ refused to boot, since I have too much memory (2GB)
and this somehow interferes with grub, but mprime (my other favourite tool
for this kind of testing) quite quickly bombed out. (This in itself usually
means bad RAM, bad motherboard or some interrupt handler messing up badly.)

I moved the RAM from DIMM slot 3+4 to 1+2, and suddenly things became a _lot_
better. I definitely can't say it's oops-free yet, since it's only been
running a couple of hours, but this certainly points towards that the larger
part of my issues were some odd sort of motherboard hiccup.

I take it this also frees the dm, md and NUMA code of most suspicion --
consider the issues fixed for now, and I'll be sure to bug you again if I
start to get oopses again. Sorry for wasting your time :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2006-05-13 23:28 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20060420160549.7637.patches@notabene>
2006-04-20  6:29 ` [PATCH] Remove softlockup from invalidate_mapping_pages NeilBrown
2006-04-20  7:38   ` Andrew Morton
2006-04-20  9:24     ` Neil Brown
2006-04-20  9:30       ` Jens Axboe
2006-04-20  9:50         ` Neil Brown
2006-04-20  9:59           ` Jens Axboe
2006-04-20 10:26             ` Neil Brown
2006-04-20  9:52       ` Andrew Morton
2006-04-20 10:52         ` Jens Axboe
2006-04-26 20:48     ` Steinar H. Gunderson
2006-04-26 20:58       ` Andrew Morton
2006-04-26 21:26         ` Steinar H. Gunderson
2006-05-01 16:07           ` Steinar H. Gunderson
2006-05-13 13:49         ` [PATCH] Remove softlockup from invalidate_mapping_pages. (might be dm related) Steinar H. Gunderson
2006-05-13 14:33           ` Andrew Morton
2006-05-13 14:43             ` Steinar H. Gunderson
2006-05-13 14:51               ` Andrew Morton
2006-05-13 15:00                 ` Steinar H. Gunderson
2006-05-13 15:24                   ` Andrew Morton
2006-05-13 15:24                     ` Andrew Morton
2006-05-13 15:32                     ` Steinar H. Gunderson
2006-05-13 15:43                       ` Andrew Morton
2006-05-13 15:43                         ` Andrew Morton
2006-05-13 17:14                         ` Steinar H. Gunderson
2006-05-13 23:28                           ` Steinar H. Gunderson
2006-05-13 22:37                         ` [dm-devel] " Alasdair G Kergon
2006-05-13 22:37                           ` Alasdair G Kergon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.